My Bio Research Code Discourse

I am a Research Scientist with the Text, Language and Computation research group at the Educational Testing Service. I obtained my Ph.D. from the Department of Computer Science at the University of Maryland, College Park. I was also a graduate research assistant in the Computational Linguistics and Information Processing Laboratory at the Institute for Advanced Computer Studies, where I worked with my advisor, Bonnie Dorr.

In general, my research has focused on building systems, with underlying computational models of language, that process written text so as to enhance our experience with those texts. As an example, in my dissertation, I demonstrated the existence of a concrete symbiotic semantic relationship between systems that translate text and those that paraphrase text; I then exploited this relationship to build better translation and paraphrasing systems. Besides machine translation and paraphrase generation, I have also worked on automatic summarization and information retrieval systems. I am also particularly interested in information and data visualization, and over the last few years I have worked on some interesting projects such as visualizing poetry for humanities scholars and interactive scoring for statistical machine translation systems.

My work at ETS has allowed me to apply NLP techniques to build useful educational applications and technologies. Some examples include mining Wikipedia revision history to correct grammatical errors, using paraphrase generation to improve sentiment analysis of essay data, and automatically detecting organizational elements in argumentative discourse.

I have also had a great time teaching computer science during my days as a graduate student and I am hoping to be involved with more of that in the future.

2014
An Explicit Feedback System for Preposition Errors based on Wikipedia Revisions. 2014. In Proc. 9th Workshop on Innovative Use of Natural Language Processing for Building Educational Applications (BEA). Nitin Madnani and Aoife Cahill. [pdf]  [bib]
Content Importance Models for Scoring Writing From Sources. 2014. In Proc. ACL (short papers). Beata Beigman Klebanov, Nitin Madnani, Jill Burstein and Swapna Somasundaran. [pdf]  [bib]
Predicting Grammaticality on an Ordinal Scale. 2014. In Proc. ACL (short papers). Michael Heilman, Aoife Cahill, Nitin Madnani, Melissa Lopez, Matthew Mulholland and Joel Tetreault. [pdf]  [bib]
Bucking the Trend: Improved Evaluation and Annotation Practices for ESL Error Detection Systems. 2014. In Language Resources & Evaluation: Special Issue on Resources and Tools for Language Learners, 48(1). Joel Tetreault, Martin Chodorow and Nitin Madnani.
2013
ParaQuery: Making Sense of Paraphrase Collections. 2013. In Proc. ACL (demos). Lili Kotlerman, Nitin Madnani and Aoife Cahill. [pdf]  [bib]
Detecting Missing Hyphens in Learner Text. 2013. In Proc. 8th Workshop on Innovative Use of Natural Language Processing for Building Educational Applications (BEA). Aoife Cahill, Martin Chodorow, Susanne Wolff and Nitin Madnani. [pdf]  [bib]
Automated Scoring of a Summary-Writing Task Designed to Measure Reading Comprehension. 2013. In Proc. 8th Workshop on Innovative Use of Natural Language Processing for Building Educational Applications (BEA). Nitin Madnani, Jill Burstein, John Sabatini and Tenaha O'Reilly. [pdf] [bib]
Robust Systems for Preposition Error Correction Using Wikipedia Revisions. 2013. In Proc. NAACL. Aoife Cahill, Nitin Madnani, Joel Tetreault and Diane Napolitano. [pdf] [bib]
HENRY-CORE: Domain Adaptation and Stacking for Text Similarity. 2013. In Proc. of the Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity. Michael Heilman and Nitin Madnani. [pdf] [bib]
ETS: Domain Adaptation and Stacking for Short Answer Scoring. 2013. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval). [Note: This version fixes some errors in the official ACL Anthology version.] Michael Heilman and Nitin Madnani. [pdf] [bib]
Using Pivot-based Paraphrasing and Sentiment Profiles to Improve a Subjectivity Lexicon for Essay Data. 2013. Transactions of the Association for Computational Linguistics, 1(2013):99-110. Beata Beigman Klebanov, Nitin Madnani and Jill Burstein. [pdf] [bib]
Sentiment Profiles of Multi-Word Expressions in Test-Taker Essays: The Case of Noun-Noun Compounds. 2013. ACM Transactions on Speech and Language Processing, 10(3):12. Beata Beigman Klebanov, Jill Burstein and Nitin Madnani.
Sentiment Analysis and Detection for Essay Evaluation. 2013. Jill Burstein, Beata Beigman Klebanov, Nitin Madnani and Adam Faulkner. Handbook for Automated Essay Scoring, Mark D. Shermis and Jill Burstein (eds.), Taylor and Francis. [link]
The E-rater Automated Essay Scoring System. Jill Burstein, Joel Tetreault and Nitin Madnani. 2013. Handbook for Automated Essay Scoring, Mark D. Shermis and Jill Burstein (eds.), Taylor and Francis. [link]
Generating Targeted Paraphrases for Improved Translation. 2013. ACM Transactions on Intelligent Systems and Technology, 4(3). Nitin Madnani and Bonnie Dorr. [pdf] [bib]
2012
Topical Trends in a Corpus of Persuasive Writing. 2012. ETS Research Report Series, RR-12-19. Michael Heilman and Nitin Madnani. [pdf] [bib]
Discriminative Edit Models for Paraphrase Scoring. 2012. In Proc. of the 6th International Workshop on Semantic Evaluation (SemEval). Michael Heilman and Nitin Madnani. [pdf] [bib]
Exploring Grammatical Error Correction with Not-So-Crummy Machine Translation. 2012. In Proc. 7th Workshop on Innovative Use of Natural Language Processing for Building Educational Applications (BEA). Nitin Madnani, Joel Tetreault and Martin Chodorow. [pdf] [bib]
Re-examining Machine Translation Metrics for Paraphrase Identification. 2012. In Proc. NAACL. Nitin Madnani, Joel Tetreault and Martin Chodorow. [pdf] [bib] [zip]
Identifying High Level Organizational Elements in Argumentative Discourse. 2012. In Proc. NAACL. Nitin Madnani, Michael Heilman, Joel Tetreault and Martin Chodorow. [pdf] [bib]
Building Subjectivity Lexicon(s) from Scratch for Essay Data. 2012. In Proc. CICLing. Beata Beigman Klebanov, Jill Burstein, Nitin Madnani and Adam Faulkner. [pdf] [bib]
2011
iBLEU: Interactively Scoring and Debugging Statistical Machine Translation Systems. 2011. In Proc. Fifth IEEE International Conference on Semantic Computing (Demos). Nitin Madnani. [pdf] [bib] [website]
E-rating Machine Translation. 2011. In Proc. WMT. Kristen Parton, Joel Tetreault, Nitin Madnani, Martin Chodorow. [pdf] [bib]
They Can Help: Using Crowdsourcing to Improve the Evaluation of Grammatical Error Detection Systems. 2011. In Proc. ACL (Short Papers). Nitin Madnani, Joel Tetreault, Martin Chodorow and Alla Rozovskaya. [pdf] [bib] [tgz]
The Web is not a PERSON, Berners-Lee is not an ORGANIZATION, and African-Americans are not LOCATIONS: An Analysis of the Performance of Named-Entity Recognition. 2011. In Proc. Workshop on Multiword Expressions. Bob Krovetz, Paul Deane and Nitin Madnani. [pdf] [bib]
2010
Machine Translation Evaluation and Optimization. 2010. Handbook of Natural Language Processing and Machine Translation. Joseph Olive, John McCary, and Caitlin Christianson (eds.) Yaser Al-Onaizan, Bonnie Dorr, Doug Jones, Jeremy Kahn, Seth Kulick, Alon Lavie, Gregor Leusch, Nitin Madnani, Chris Manning, Arne Mauser, Alok Parlikar, Mark Przybocki, Rich Schwartz, Matthew Snover, Stephan Vogel and Clare Voss. [html]
Putting the User in the Loop: Interactive Maximal Marginal Relevance for Query-Focused Summarization. 2010. In Proc. NAACL (Short Papers). Jimmy Lin, Nitin Madnani and Bonnie J. Dorr. [pdf] [bib]
Measuring Transitivity using Untrained Annotators. 2010. In Proc. Workshop on Creating Speech and Language Data With Amazon’s Mechanical Turk. Nitin Madnani, Jordan Boyd-Graber and Philip Resnik. [pdf] [bib]
Generating Phrasal & Sentential Paraphrases: A Survey of Data-Driven Methods. 2010. Computational Linguistics, 36(3):341-387. Nitin Madnani and Bonnie Dorr. [pdf] [bib]
The Circle of Meaning: From Translation to Paraphrasing and Back. 2010. Doctoral Dissertation. Department of Computer Science. University of Maryland College Park. [pdf] [bib]
The Python and The Elephant: Large Scale Natural Language Processing with NLTK and Dumbo. 2010. In Proc. of the Eighth Annual Python Conference. Nitin Madnani and Jimmy Lin. [video] [bib] [zip]
TER-Plus: Paraphrase, Semantic, and Alignment Enhancements to Translation Edit Rate. 2010. Machine Translation, 23(2-3):117-127. Matthew Snover, Nitin Madnani, Bonnie Dorr and Richard Schwartz. [html] [bib]
2009
Fluency, Adequacy, or HTER? Exploring Different Human Judgments with a Tunable MT Metric. 2009. In Proc. WMT. Matthew Snover, Nitin Madnani, Bonnie Dorr and Richard Schwartz. [pdf] [bib]
Querying and Serving N-gram Language Models with Python. 2009. The Python Papers. Volume 4, Issue 2. Nitin Madnani. [pdf] [bib]
Source Code: Querying and Serving N-gram Language Models with Python. 2009. The Python Papers Source Codes. Volume 1. Nitin Madnani. [pdf] [bib]
2008
Applying Automatically Generated Semantic Knowledge: A Case Study in Machine Translation. 2008. In Proc. of the Symposium on Semantic Knowledge Discovery, Organization and Use. Nitin Madnani, Philip Resnik, Bonnie Dorr and Richard Schwartz. [pdf] [bib] [poster]
Are Multiple Reference Translations Necessary? Investigating the Value of Paraphrased Reference Translations in Parameter Optimization. 2008. In Proc. AMTA. Nitin Madnani, Philip Resnik, Bonnie Dorr and Richard Schwartz. [pdf] [bib]
Combining Open-Source with Research to Re-engineer a Hands-on Introductory NLP Course. 2008. In Proc. of the Third ACL Workshop on Issues in Teaching Computational Linguistics (TeachCL-08). Nitin Madnani and Bonnie Dorr. [pdf] [bib]
Multiple Alternative Sentence Compressions and Word-Pair Antonymy for Automatic Text Summarization and Recognizing Textual Entailment. 2008. In Proc. of the Text Analysis Conference (TAC). Saif Mohammad, Bonnie J. Dorr, Melissa Egan, Nitin Madnani, David Zajic, and Jimmy Lin. [pdf] [bib]
TERp: A System Description. 2008. In Proc. of the First NIST Metrics for Machine Translation Challenge (MetricsMATR). Matthew Snover, Nitin Madnani, Bonnie Dorr and Richard Schwartz. [pdf] [bib]
2007
Using Paraphrases for Parameter Tuning in Statistical Machine Translation. 2007. In Proc. WMT. Nitin Madnani, Necip Fazil Ayan, Philip Resnik, Bonnie Dorr. [pdf] [bib]
Measuring Variability in Sentence Ordering for News Summarization. 2007. In Proc. ENLG. Nitin Madnani, Rebecca Passonneau, John Conroy, Necip Fazil Ayan, Bonnie Dorr, Judith Klavans, Dianne O'Leary and Judith Schlesinger. [pdf] [bib]
Getting Started on Natural Language Processing with Python. 2007. ACM Crossroads, 13(4). Nitin Madnani. [Note: The PDF version has been completely revised since the official ACM version to keep up to date with the changes made to the software used in the article.] [pdf] [html] [bib]
TREC 2007 ciQA Task: University of Maryland. 2007. In Proc. TREC. Nitin Madnani, Jimmy Lin, and Bonnie Dorr. [pdf] [bib]
Multiple Alternative Sentence Compressions for Automatic Text Summarization. 2007. In Proc. of the Document Understanding Conference (DUC) at HLT/NAACL. Nitin Madnani, David Zajic, Bonnie Dorr, Necip Fazil Ayan and Jimmy Lin. [pdf] [bib]
2005 and earlier
The Hiero Machine Translation System: Extensions, Evaluation, and Analysis. 2005. In Proc. HLT/EMNLP. David Chiang, Adam Lopez, Nitin Madnani, Christof Monz, Philip Resnik and Michael Subotin. [pdf] [bib]
Rapid Porting of DUSTer to Hindi. 2003. ACM Transactions on Asian Language Information Processing, 2(2). Bonnie J. Dorr, Necip Fazil Ayan, Nizar Habash, Nitin Madnani, and Rebecca Hwa. [pdf] [bib]
Python & Perl wrappers for SRILM: Wrappers that will allow you to read and query an SRI language model directly in your Python and Perl code. [link]
(Note: I also have working Python and Perl wrappers for the IRSTLM toolkit but I am too lazy to put them up here. Drop me a note if you are interested and I will send them to you.)
Interactive BLEU Scoring Tool: A visual and interactive environment for scoring output of automatic machine translation systems. It's written to run entirely in the browser and utilizes the latest web technologies to allow interactive qualitative examination of MT output. [link]
ParaQuery: An interactive querying tool for pivot-based paraphrase databases. Written entirely in Python. Work done in collaboration with Lili Kotlerman (my summer intern in 2012) and Aoife Cahill. [link] [pdf]
SciKit-Learn Laboratory (SKLL): SKLL (pronounced "skull") provides a number of utilities to make it simpler to run common scikit-learn experiments with pre-generated features. Work done in collaboration with Daniel Blanchard and Michael Heilman. [link]
python-zpar: A Python wrapper around the ZPar English parser. [link]
node-zpar: A module that allows using the ZPar English parser with Node.js. [link]
WebSocket Stanford Tagger Server: This project provides a WebSocket server that wraps the Stanford Part-of-Speech tagger. This makes it easier to get part-of-speech tags from JavaScript for arbitrary text. Includes a jQuery demo. [link]
clusterinfo: A Python script that displays current usage of a PBS-based cluster in a more condensed and easier-to-read format. [tgz]
LM Server: A Python-based XML-RPC server for an SRILM language model. Allows multiple clients to query the same language model that's loaded in memory in server mode. [tgz]
UMIACS Word Alignment Interface: A Java-based tool for creating and viewing word alignments between language pairs. It has been widely used across the community to create alignments for many language pairs including Welsh-English, Swahili-English, Czech-English and Chinese-English. [link]
TER-plus (TERp): An automated evaluation metric for Machine Translation, comparing system outputs to reference translations. TERp utilizes automatically generated paraphrases, stemming, synonyms, relaxed shifting constraints and other improvements. [link]
(Note: Work done in collaboration with Matthew Snover, the main developer of TERp.)
An explanation of and thoughts about the Facebook PNAS paper on emotion contagion. 2014. [link]
Using Wikipedia Revisions for Automated Grammatical Error Correction. 2013. Invited CLUNCH talk at the University of Pennsylvania.
What Test Takers Say: Analyzing Argument Organization and Topical Trends in Essays. 2013. Invited talk for the Linguistics brown bag at Montclair State University.
A Story in Pictures. 2012. Entry for the Automated Student Assessment Prize Essay Visualization Contest organized by the William and Flora Hewlett Foundation. [pdf]
[Note: This entry won the first prize in the contest by popular vote.]
Using Statistical Machine Translation to Improve Statistical Machine Translation. 2011. Invited talk for the Yahoo! Data Sciences Seminar at Rutgers University. [pdf]
The Circle of Meaning: From Translation to Paraphrasing and Back. 2010. Invited talk for the NLP Seminar at CUNY Graduate Center, New York.
A timeline of inter-annotator agreement measures in Computational Linguistics based on Inter-Coder Agreement for Computational Linguistics by Ron Artstein and Massimo Poesio. Linguistics seminar on Corpus-based Social Science, University of Maryland. [pdf]
Decoding in Statistical Machine Translation. 2006. StatMT Reading Group, University of Maryland. [slides]
Expectation Maximization. 2004. Advanced NLP Seminar, University of Maryland. [slides]