In pursuit of happiness!: 2014

Sunday, December 21, 2014

Codahale Metrics and Spring

This is a revision of my previous post on Coda Hale Metrics available here.

Goal: Integrate Metrics (v 3.1.0) and Spring (v 4.1.x) in a JEE environment.

Metrics-spring resides here. Code snippets to help you get started below:

1. Add the following dependencies to pom.xml


<dependency>
     <groupId>io.dropwizard.metrics</groupId>
     <artifactId>metrics-servlets</artifactId>
     <version>3.1.0</version>
</dependency>
<dependency>
     <groupId>com.ryantenney.metrics</groupId>
     <artifactId>metrics-spring</artifactId>
     <version>3.0.3</version>
</dependency>
<!-- .. other Metrics dependencies -->

2. Add the following in web.xml


<servlet>
 <servlet-name>metrics-admin</servlet-name>
 <servlet-class>com.codahale.metrics.servlets.AdminServlet</servlet-class>
</servlet>
<servlet-mapping>
 <servlet-name>metrics-admin</servlet-name>
 <url-pattern>/metrics/admin/*</url-pattern>
</servlet-mapping>

3. Define metrics.xml and include it in your main spring configuration file.


<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:metrics="http://www.ryantenney.com/schema/metrics"
    xsi:schemaLocation="http://www.springframework.org/schema/beans
            http://www.springframework.org/schema/beans/spring-beans.xsd
               http://www.ryantenney.com/schema/metrics
               http://www.ryantenney.com/schema/metrics/metrics-3.0.xsd">

    <!-- Registry should be defined in only one context XML file -->
    <metrics:metric-registry id="metrics" />

    <metrics:health-check-registry id="healthCheck" />

    <!-- annotation-driven must be included in all context files -->
    <metrics:annotation-driven metric-registry="metrics"
        health-check-registry="healthCheck" />

    <!-- (Optional) Reporters should be defined in only one context XML file -->
    <metrics:reporter type="console" metric-registry="metrics"
        period="1m" />

    <bean
        class="org.springframework.web.context.support.ServletContextAttributeExporter">
        <property name="attributes">
            <map>
                <entry key="com.codahale.metrics.servlets.MetricsServlet.registry">
                    <ref bean="metrics" />
                </entry>
                <entry key="com.codahale.metrics.servlets.HealthCheckServlet.registry">
                    <ref bean="healthCheck" />
                </entry>
            </map>
        </property>
    </bean>
</beans>

4. Define HealthCheck classes and annotate methods with @Timed as per Metrics documentation. And we are done!

Navigate to: http://hostname:port/<webappname>/metrics/admin to view "Operational Menu"

The metrics are output in JSON format. You can parse the output and pass it to your favorite graphing library.
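
As a minimal sketch of the parsing step, the Python snippet below flattens the timers section of a metrics payload into rows that a graphing library could consume. The sample payload is hand-written for illustration: its field names (timers, count, mean, p99, m1_rate) follow the layout the metrics servlet typically emits, but treat them as assumptions and check them against your own endpoint's output. The timer name and the function name are hypothetical.

```python
import json

# Hand-written sample that mimics the JSON layout of the metrics endpoint.
# The timer name and all values are made up for illustration.
SAMPLE_PAYLOAD = """
{
  "version": "3.0.0",
  "gauges": {},
  "counters": {},
  "histograms": {},
  "meters": {},
  "timers": {
    "com.example.OrderService.placeOrder": {
      "count": 42,
      "mean": 0.12,
      "p99": 0.35,
      "m1_rate": 0.7
    }
  }
}
"""


def timer_rows(payload, fields=("count", "mean", "p99", "m1_rate")):
    """Return one (timer name, field, value) row per requested statistic."""
    metrics = json.loads(payload)
    rows = []
    for name, stats in sorted(metrics.get("timers", {}).items()):
        for field in fields:
            if field in stats:
                rows.append((name, field, stats[field]))
    return rows


if __name__ == "__main__":
    for name, field, value in timer_rows(SAMPLE_PAYLOAD):
        print("%s %s=%s" % (name, field, value))
```

In a real setup the payload would be fetched from the metrics link on the Operational Menu instead of being embedded as a string.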

Note: With this configuration, all metrics are also reported to the console. You can easily enable other reporters in metrics.xml.

Friday, October 17, 2014

Stuff Software Engineers should know about product management!

Success will most likely come your way if you have a great Product Manager on the team. However, not all teams are that lucky. In some team setups, senior tech leads or architects also double up as product managers. Even otherwise, it is good to know some product management so that you can make sound decisions about your architecture, code, deployment and metrics.

Here are some tips to build a great product when software engineers manage the product:

1. Know your customers and their problems.

Who is the end user? Meet and talk to them even if they are sitting in a warehouse!

Ask whether feature X must be included before they can use the product. Is it an incremental value addition or a game changer? This will get you to an MVP quickly.

2. What is your delivery channel?
Mobile, web or multichannel presence. Do your research and pick one to start with.

3. List and estimate non-functional requirements

How many concurrent users? Acceptable latency? Downtime? You get the gist.

4. Release fast and release often

Take customer feedback and iterate. Building the most awesome product takes time. And time is often a luxury.

Releasing the product often keeps you on track. The intensity and quality of the feedback help you prioritize the next set of features to code. Feature prioritization is the key. It also ensures that the team is productive.

If your customers say "I want to use it today", you are on track. Make the feature available to them in pre-production.

Such interaction also saves you from big software re-architectures and re-implementation.

Caution: Ensure that your customers spend serious time interacting with your software.

5. Measure your success (or failure!)

It helps you analyze what's working and what's not. Get rid of the latter and reinforce the former. Set goals and strive towards them.

If you can measure metrics like conversion, sales, traffic and API calls in real time, there is nothing like it!

6. Experiment
Fear not! Capture those new ideas from the team. Let the numbers and your intuition find their way to newer and successful experiments.

That one new feature can be the game changer!

7. Provide good customer service!

Monitor your application and support it. FAQs, documentation, tickets, training etc. are some options to choose from.

And may you become a great product manager!

Wednesday, October 15, 2014

Software stack deployment automation the Amazon CloudFormation way!

Deploying a complete software stack is a breeze with Amazon's CloudFormation. It provides a repeatable and predictable mechanism to launch a stack comprising EC2 instances, load balancers, databases and other Amazon resources.

There are many sample CloudFormation (CF) recipes to get you started, the LAMP recipe being the simplest one. Once the hardware is set up, you can use CloudFormation to bootstrap applications. This is where it gets a little tricky.

Multiple options are available, including CloudInit: you can pass executable actions to instances at launch time through the EC2 user-data attribute. Other options are Chef and Puppet.
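
To make the user-data option concrete, here is a minimal sketch that builds a CloudFormation template in Python and prints it as JSON. It declares a single EC2 instance whose UserData carries a shell script to be run at first boot. The AMI ID, the instance type, the logical resource name WebServer and the contents of the script are placeholders for illustration, not values from a real deployment.

```python
import json

# Shell script passed to the instance at launch time through user-data.
# The package installed here is only an example.
BOOTSTRAP_SCRIPT = "\n".join([
    "#!/bin/bash",
    "yum install -y tomcat7",
    "service tomcat7 start",
])


def build_template(image_id, instance_type="t2.micro"):
    """Return a CloudFormation template (as a dict) with one EC2 instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Single web server bootstrapped through user-data",
        "Resources": {
            "WebServer": {  # hypothetical logical name
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": image_id,
                    "InstanceType": instance_type,
                    # User-data must be Base64-encoded; the Fn::Base64
                    # intrinsic function does the encoding at deploy time.
                    "UserData": {"Fn::Base64": BOOTSTRAP_SCRIPT},
                },
            }
        },
    }


if __name__ == "__main__":
    # "ami-12345678" is a placeholder; substitute an AMI from your region.
    print(json.dumps(build_template("ami-12345678"), indent=2))
```

The printed JSON can be saved to a file and used as the template when creating a stack.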

Some scenarios where CloudFormation can be of great help:

You own a simple web application running on Tomcat and fronted by a load balancer. New engineers joining your team can simply run the CF template to create their own developer environment on the cloud.
You own an application that builds on a multi-instance architecture. In a multi-instance architecture, separate software instances (or hardware systems) operate on behalf of different client organizations. Launching a new stack for a new client becomes just too simple!
Continuous delivery is simplified by the use of CF. Say you launch a new stack for QA for Feature 1. Normally, Feature 2 would have to wait for deployment until QA has certified Feature 1. Instead, launch a new stack for Feature 2 and build parallel continuous delivery pipelines.

Do remember to delete the stack once it is no longer needed!

Monday, August 4, 2014

Pray!

Right off YouTube -

"Please pray for my friend's 4 year old son. I just found out that 4 minutes of his life was not documented on Facebook."

Saturday, June 7, 2014

Collocation extraction using NLTK

A collocation is an expression consisting of two or more words that correspond to some conventional way of saying things. Collocations include noun phrases like strong tea and weapons of mass destruction, phrasal verbs like to make up, and other stock phrases like the rich and powerful.

Collocations are important for a number of applications: natural language generation (to make sure that the output sounds natural and mistakes like powerful tea or to take a decision are avoided), computational lexicography (to automatically identify the important collocations to be listed in a dictionary entry), parsing (so that preference can be given to parses with natural collocations), and corpus linguistic research.

Two collocation algorithms:

a. The simplest method for finding collocations in a text corpus is counting. Pass the candidate phrases through a part-of-speech filter which only lets through those patterns that are likely to be “phrases”.

b. Mean and variance based methods work by looking at the pattern of varying distance between two words.
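
The two approaches above can be sketched in plain Python without any NLP library. The first function counts adjacent word pairs and keeps only those whose part-of-speech tags match an adjective-noun or noun-noun pattern. The second computes the mean and sample variance of the signed distance between two words within a window, where a low variance suggests a fixed phrase. The tiny corpora, the tag names (A, N, V, P) and the function names are assumptions made for illustration.

```python
from collections import Counter

# Tag patterns that are likely to be phrases: adjective-noun and noun-noun.
PHRASE_PATTERNS = {("A", "N"), ("N", "N")}


def count_filtered_bigrams(tagged_tokens):
    """Count adjacent (word, word) pairs whose tag pair passes the filter."""
    counts = Counter()
    for (w1, t1), (w2, t2) in zip(tagged_tokens, tagged_tokens[1:]):
        if (t1, t2) in PHRASE_PATTERNS:
            counts[(w1, w2)] += 1
    return counts


def offset_mean_variance(sentences, w1, w2, window=5):
    """Mean and sample variance of the signed offset from w1 to w2.

    Offsets are collected per sentence, so a window never crosses a
    sentence boundary. Returns None when fewer than two offsets are found.
    """
    offsets = []
    for tokens in sentences:
        for i, token in enumerate(tokens):
            if token != w1:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            offsets.extend(j - i for j in range(lo, hi)
                           if j != i and tokens[j] == w2)
    if len(offsets) < 2:
        return None
    mean = sum(offsets) / float(len(offsets))
    variance = sum((d - mean) ** 2 for d in offsets) / (len(offsets) - 1)
    return mean, variance


if __name__ == "__main__":
    tagged = [("strong", "A"), ("tea", "N"), ("is", "V"), ("strong", "A"),
              ("tea", "N"), ("with", "P"), ("milk", "N")]
    print(count_filtered_bigrams(tagged).most_common(1))

    sentences = [s.split() for s in (
        "she knocked on his door",
        "they knocked at the door",
        "a man knocked on the metal front door",
    )]
    print(offset_mean_variance(sentences, "knocked", "door"))
```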

We also want to know whether two words occur together more often than chance. Assessing whether or not something is a chance event is one of the classical problems of statistics. It is usually couched in terms of hypothesis testing. The options are the t-test, chi-square, PMI, likelihood ratios, the Poisson-Stirling measure and the Jaccard index.
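
As one concrete instance of these measures, pointwise mutual information (PMI) compares how often two words co-occur with how often they would co-occur if they were independent: PMI(w1, w2) = log2( P(w1, w2) / (P(w1) P(w2)) ). The sketch below estimates the probabilities as relative frequencies of adjacent pairs and single words over the same token count; the function name and the toy token list are assumptions for illustration.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Return {(w1, w2): PMI in bits} for every adjacent word pair."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = float(len(tokens))
    scores = {}
    for (w1, w2), joint in bigrams.items():
        # Probabilities are estimated as relative frequencies over n tokens.
        p_joint = joint / n
        p_w1 = unigrams[w1] / n
        p_w2 = unigrams[w2] / n
        scores[(w1, w2)] = math.log(p_joint / (p_w1 * p_w2), 2)
    return scores


if __name__ == "__main__":
    scores = pmi_scores("a b a b c d".split())
    for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print("%s %.3f" % (pair, score))
```

Note that PMI tends to favor rare pairs, which is one reason to apply a frequency filter before ranking.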

The likelihood ratio is an effective approach to hypothesis testing for extracting collocations. You can read more about it here.
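
A minimal sketch of the likelihood ratio score for a bigram follows, using the usual binomial formulation: compare the hypothesis that w2 is independent of w1 (a single probability p) against the hypothesis that it depends on w1 (separate probabilities p1 and p2), and report -2 log(lambda). The function names and the counts in the demo are assumptions for illustration; this is not NLTK's implementation.

```python
import math


def _log_l(k, n, x):
    """log of x**k * (1 - x)**(n - k), treating 0 * log(0) as 0."""
    result = 0.0
    if k > 0:
        result += k * math.log(x)
    if n - k > 0:
        result += (n - k) * math.log(1.0 - x)
    return result


def likelihood_ratio(c1, c2, c12, n):
    """Return -2 log(lambda) for the bigram (w1, w2).

    c1, c2: counts of w1 and w2; c12: count of "w1 w2"; n: total tokens.
    """
    if not (0 <= c12 <= min(c1, c2) and 0 < c1 < n and 0 < c2 < n):
        raise ValueError("counts are inconsistent")
    p = c2 / float(n)                  # P(w2) under independence
    p1 = c12 / float(c1)               # P(w2 | w1)
    p2 = (c2 - c12) / float(n - c1)    # P(w2 | not w1)
    log_lambda = (_log_l(c12, c1, p) + _log_l(c2 - c12, n - c1, p)
                  - _log_l(c12, c1, p1) - _log_l(c2 - c12, n - c1, p2))
    return -2.0 * log_lambda


if __name__ == "__main__":
    # Made-up counts: w1 seen 10 times, w2 seen 20 times, 100 tokens in total.
    for c12 in (2, 5, 10):
        print("c12=%d score=%.2f" % (c12, likelihood_ratio(10, 20, c12, 100)))
```

When the observed co-occurrence count matches what independence predicts, the score is zero; it grows as the pair co-occurs more often than chance.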

Now, on to some code to see how collocations can be extracted. Collocation processing modules are available in Apache Mahout, NLTK and Stanford NLP. In this article, we will use the NLTK library, written in Python, to extract bigram collocations from a set of documents.


'''
@author: Nishant
'''

import nltk
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

# DataLoader is the author's own module; provide your implementation here.
from com.nishant import DataLoader

if __name__ == '__main__':

    tokens = []
    # Requires the NLTK stopwords corpus: nltk.download('stopwords')
    stopwords = set(nltk.corpus.stopwords.words('english'))

    # Load multiple documents and return them as a list of strings.
    data = DataLoader.load("data.xml")

    print('Total documents loaded: %d' % len(data))

    # Tokenize each document and drop stop words
    for d in data:
        data_tokens = nltk.wordpunct_tokenize(d)
        filtered_tokens = [w for w in data_tokens if w.lower() not in stopwords]
        tokens.extend(filtered_tokens)

    print('Total tokens loaded: %d' % len(tokens))
    print('Calculating Collocations')

    # Extract only bigrams within a window of 5 tokens and ignore pairs
    # seen fewer than 5 times. An implementation for trigrams is also
    # available. NLTK has utility functions for frequency, n-gram and
    # word filters.
    bigram_measures = BigramAssocMeasures()
    finder = BigramCollocationFinder.from_words(tokens, window_size=5)
    finder.apply_freq_filter(5)

    # Return the 1000 n-grams with the highest likelihood ratio.
    # You can also use PMI or other methods and compare results.
    print('Printing Top 1000 Collocations')
    print(finder.nbest(bigram_measures.likelihood_ratio, 1000))


Output >>
('side', 'neck'), ('stomach', 'pain'), ('doctor', 'suggested'), ('loss', 'appetite') .....

NLTK documentation for collocations is available here.
Must read on this topic - http://nlp.stanford.edu/fsnlp/promo/colloc.pdf

Monday, April 21, 2014

The sweetness of developing REST services using Dropwizard

Jersey is my go-to software for developing REST services, and then I came across Dropwizard. It is a simple and neat framework written on top of Jersey that glues together all the essential libraries for creating production-ready services.

Before externalizing a web service, it must be operationally ready to take real-world traffic and provide HA. So, many engineers end up writing health checks, enabling the required logs and metrics, etc. All of these features are available out of the box in Dropwizard. Furthermore, it has nice features such as an HTTP client for invoking other services, authentication, integration with Hibernate and DI.

So, while the documentation is straightforward and the 'Getting Started' guide does get you started, integration with Spring and JPA via Hibernate requires some work. This article will help you with that, assuming you have a working service written using Dropwizard.

1. Wiring up Spring and Dropwizard. HelloWorldApplication is the entry point to the Dropwizard application. The run method is where we initialize Spring.



    @Override
    public void run(HelloWorldConfiguration configuration,
                    Environment environment) {

        // Init Spring context.
        // Before we init the app context, we have to create a parent context
        // with all the config objects others rely on to get initialized.
        AnnotationConfigWebApplicationContext parent = new AnnotationConfigWebApplicationContext();
        AnnotationConfigWebApplicationContext ctx = new AnnotationConfigWebApplicationContext();

        parent.refresh();
        parent.getBeanFactory().registerSingleton("configuration", configuration);
        parent.registerShutdownHook();
        parent.start();

        // The real main app context has a link to the parent context.
        ctx.setParent(parent);
        ctx.register(MyAppSpringConfiguration.class);
        ctx.refresh();
        ctx.registerShutdownHook();
        ctx.start();

        // Now that Spring is started, let's get all the beans that matter into Dropwizard.

        // Health checks: register each one under its bean name so that
        // multiple health checks do not collide on a single name.
        Map<String, HealthCheck> healthChecks = ctx.getBeansOfType(HealthCheck.class);
        for (Map.Entry<String, HealthCheck> entry : healthChecks.entrySet()) {
            environment.healthChecks().register(entry.getKey(), entry.getValue());
        }

        // Resources
        Map<String, Object> resources = ctx.getBeansWithAnnotation(Path.class);
        for (Map.Entry<String, Object> entry : resources.entrySet()) {
            environment.jersey().register(entry.getValue());
        }

        // Last, but not least, let's link Spring to the embedded Jetty in Dropwizard.
        // SpringContextLoaderListener is assumed to be a class defined in the application.
        environment.servlets().addServletListeners(new SpringContextLoaderListener(ctx));
    }

This is a standard way to initialize Spring. Two important points to note here. First, there is no need to register resources explicitly: the code looks up the Spring beans annotated with @Path and registers them with Dropwizard. Second, include MyAppSpringConfiguration, which pulls in the additional Spring configuration.

2. Definition for MyAppSpringConfiguration is below:



/**
Main Spring Configuration
 */
@Configuration
@ImportResource({ "classpath:spring/applicationContext.xml", "classpath:spring/dao.xml" })
@ComponentScan(basePackages = {"com.nishant.example"})
public class MyAppSpringConfiguration {

}

3. Next, annotate the resource class with @Service annotation and we are ready!


@Service
@Path("/hello-world")
@Produces(MediaType.APPLICATION_JSON)
public class HelloWorldResource {

    @Autowired
    private HelloWorldConfiguration configuration;

    private final AtomicLong counter = new AtomicLong();

    @GET
    @Timed
    public Saying sayHello(@QueryParam("name") Optional<String> name) {

        final String value = String.format(configuration.getTemplate(), name.or(configuration.getDefaultName()));
        return new Saying(counter.incrementAndGet(), value);
    }
}

For a complete working example, see https://github.com/nchandra/SampleDropwizardService
