Sameer Singh

University of Washington
Seattle, WA

Postdoctoral Researcher
2013–present

Sameer Singh
CSE 480
Computer Science & Engineering
University of Washington
185 Stevens Way
Seattle, WA 98195
sameer@cs.washington.edu

Sameer Singh is a Postdoctoral Researcher (Research Associate) in the Department of Computer Science & Engineering at the University of Washington, working with Carlos Guestrin, Luke Zettlemoyer, and Dan Weld; he also worked briefly with the late Ben Taskar. His research focuses on large-scale and interactive machine learning applied to information extraction and natural language processing, along with practical probabilistic programming languages such as Factorie and Wolfe.

He finished his PhD in Computer Science at the University of Massachusetts Amherst in 2014 under the supervision of Andrew McCallum, as part of the Information Extraction and Synthesis Lab (IESL). His dissertation focused on scalable approximate inference and discriminative training techniques for large probabilistic graphical models. During his PhD, he interned at Microsoft Research (Cambridge), Google Research, and Yahoo! Labs.

He was selected as a DARPA Riser in 2015, won the grand prize in the Yelp Dataset Challenge in 2015, received the UW CSE Postdoc Research Grant in 2014 and 2015, the Yahoo! Key Scientific Challenges Fellowship in 2010, the Accomplishments in Search and Mining award from Yahoo! in 2010, and the UMass Graduate School Fellowship in 2009, and was a finalist for the Facebook PhD Fellowship in 2012. Sameer is one of the founding organizers of the popular NIPS Big Learning and ICML Inferning workshops, and also organized the Automated Knowledge Base Construction (AKBC) workshops in 2013, 2014, and 2016.

Selected Recent Publications

  • M. Tulio Ribeiro, S. Singh, C. Guestrin. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. Knowledge Discovery and Data Mining (KDD). 2016. Conference.
    Also presented at the CHI 2016 Workshop on Human-Centred Machine Learning (HCML).
    Coming Soon!
    @inproceedings{lime:kdd16,
     author = {Marco Tulio Ribeiro and Sameer Singh and Carlos Guestrin},
     title = {``Why Should {I} Trust You?'': Explaining the Predictions of Any Classifier},
     booktitle = {Knowledge Discovery and Data Mining (KDD)},
     year = {2016}
    }
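The core recipe of the KDD 2016 paper can be sketched independently of the full system: explain a single prediction of a black-box classifier by sampling perturbations around the instance, weighting them by proximity, and fitting a simple linear surrogate whose coefficients serve as the explanation. The sketch below is a minimal illustration of that idea; the `black_box` classifier, the Gaussian perturbations, and the kernel width are illustrative assumptions, not the paper's actual setup.

```python
import math
import random

def black_box(x):
    # Hypothetical black-box classifier (not from the paper): its output
    # probability rises with x[0] and falls with x[1].
    return 1.0 / (1.0 + math.exp(-(2.0 * x[0] - x[1])))

def local_surrogate(f, x, n_samples=2000, width=0.75, seed=0):
    """Fit a proximity-weighted linear model around x (a LIME-style sketch)."""
    rng = random.Random(seed)
    X, y, w = [], [], []
    for _ in range(n_samples):
        z = [xi + rng.gauss(0.0, 1.0) for xi in x]          # perturb the instance
        d2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        X.append([1.0] + z)                                  # intercept + features
        y.append(f(z))                                       # query the black box
        w.append(math.exp(-d2 / width ** 2))                 # nearer samples weigh more
    # Solve the weighted normal equations (X^T W X) beta = (X^T W y).
    k = len(X[0])
    A = [[sum(wi * row[i] * row[j] for wi, row in zip(w, X)) for j in range(k)]
         for i in range(k)]
    b = [sum(wi * row[i] * yi for wi, row, yi in zip(w, X, y)) for i in range(k)]
    for c in range(k):                      # Gaussian elimination with pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, k):
            m = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= m * A[c][j]
            b[r] -= m * b[c]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][j] * beta[j] for j in range(r + 1, k))) / A[r][r]
    return beta  # [intercept, local weight for x[0], local weight for x[1]]

beta = local_surrogate(black_box, [0.0, 0.0])
```

The recovered local weights mirror the black box's behavior near the instance: positive for x[0], negative for x[1], which is the kind of faithful local explanation the paper argues for.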
  • T. Rocktäschel, S. Singh, S. Riedel. Injecting Logical Background Knowledge into Embeddings for Relation Extraction. Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). 2015. Conference.
    Matrix factorization approaches to relation extraction provide several attractive features: they support distant supervision, handle open schemas, and leverage unlabeled data. Unfortunately, these methods share a shortcoming with all other distantly supervised approaches: they cannot learn to extract target relations without existing data in the knowledge base, and likewise, these models are inaccurate for relations with sparse data. Rule-based extractors, on the other hand, can be easily extended to novel relations and improved for existing but inaccurate relations, through first-order formulae that capture auxiliary domain knowledge. However, usually a large set of such formulae is necessary to achieve generalization.
    In this paper, we introduce a paradigm for learning low-dimensional embeddings of entity-pairs and relations that combine the advantages of matrix factorization with first-order logic domain knowledge. We introduce simple approaches for estimating such embeddings, as well as a novel training algorithm to jointly optimize over factual and first-order logic information. Our results show that this method is able to learn accurate extractors with little or no distant supervision alignments, while at the same time generalizing to textual patterns that do not appear in the formulae.
    @inproceedings{logicmf:naacl15,
     author = {Tim Rockt{\"a}schel and Sameer Singh and Sebastian Riedel},
     title = {Injecting Logical Background Knowledge into Embeddings for Relation Extraction},
     booktitle = {Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
     year = {2015}
    }
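The joint objective described in the abstract, factual evidence combined with first-order implications, can be illustrated with a toy factorization model. In the sketch below, the knowledge base, the professorAt ⇒ employeeAt rule, and the one-sided hinge update for implications are all invented for illustration and are far simpler than the paper's actual training algorithm.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

# Hypothetical toy data (names invented, not from the paper's corpus).
pairs = ["(anna,mit)", "(bob,acme)"]
relations = ["professorAt", "employeeAt"]
facts = [("professorAt", "(anna,mit)", 1.0),   # observed fact
         ("professorAt", "(bob,acme)", 0.0)]   # sampled negative
rules = [("professorAt", "employeeAt")]        # professorAt(x,y) => employeeAt(x,y)

rng = random.Random(0)
dim = 4
emb = {name: [rng.gauss(0.0, 0.1) for _ in range(dim)]
       for name in pairs + relations}

lr = 0.1
for _ in range(500):
    # Factual term: logistic loss on observed cells of the relation matrix.
    for rel, pair, label in facts:
        r, p = emb[rel], emb[pair]
        g = sigmoid(dot(r, p)) - label
        for i in range(dim):
            r[i], p[i] = r[i] - lr * g * p[i], p[i] - lr * g * r[i]
    # Logic term: a one-sided hinge pushing each conclusion's score to be
    # at least the premise's score, for every entity pair.
    for prem, conc in rules:
        for pair in pairs:
            p = emb[pair]
            if dot(emb[prem], p) > dot(emb[conc], p):
                for i in range(dim):
                    emb[conc][i] += lr * p[i]

def score(rel, pair):
    return sigmoid(dot(emb[rel], emb[pair]))

s_anna = score("employeeAt", "(anna,mit)")
s_bob = score("employeeAt", "(bob,acme)")
```

Even though employeeAt was never observed for any pair, the implication propagates the professorAt evidence into it, which is the zero-shot behavior the abstract highlights.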
  • T. Chen, S. Singh, B. Taskar, C. Guestrin. Efficient Second-Order Gradient Boosting for Conditional Random Fields. International Conference on Artificial Intelligence and Statistics (AISTATS). 2015. Conference.
    Coming Soon!
    @inproceedings{gbcrf:aistats15,
     author = {Tianqi Chen and Sameer Singh and Ben Taskar and Carlos Guestrin},
     title = {Efficient Second-Order Gradient Boosting for Conditional Random Fields},
     booktitle = {International Conference on Artificial Intelligence and Statistics (AISTATS)},
     year = {2015}
    }
  • X. Ling, S. Singh, D. Weld. Design Challenges for Entity Linking. Transactions of the Association for Computational Linguistics (TACL). 2015. Journal.
    To be presented at ACL, Beijing, July 26-31, 2015.
    Recent research on entity linking (EL) has introduced a plethora of promising techniques, ranging from deep neural networks to joint inference. But despite numerous papers there is surprisingly little understanding of the state of the art in EL. We attack this confusion by analyzing differences between several versions of the EL problem and presenting a simple yet effective, modular, unsupervised system, called Vinculum, for entity linking. We conduct an extensive evaluation on nine data sets, comparing Vinculum with two state-of-the-art systems, and elucidate key aspects of the system that include mention extraction, candidate generation, entity type prediction, entity coreference, and coherence.
    @article{el:tacl15,
     author = {Xiao Ling and Sameer Singh and Dan Weld},
     title = {Design Challenges for Entity Linking},
     journal = {Transactions of the Association for Computational Linguistics (TACL)},
     year = {2015}
    }
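The modular view of entity linking described in the abstract, candidate generation followed by context-based disambiguation, can be illustrated with a toy pipeline. The alias dictionary, knowledge-base entries, and bag-of-words overlap scorer below are hypothetical stand-ins, not Vinculum's actual components.

```python
# Hypothetical toy knowledge base (entries invented for illustration).
kb = {
    "Q1": {"name": "University of Washington",
           "description": "public university in seattle washington"},
    "Q2": {"name": "George Washington",
           "description": "first president of the united states"},
}
# Candidate generation: a surface-form alias dictionary.
aliases = {"washington": ["Q1", "Q2"]}

def link(mention, context):
    """Link a mention by ranking KB candidates on context-description overlap."""
    candidates = aliases.get(mention.lower(), [])
    if not candidates:
        return None
    ctx = set(context.lower().split())
    # Disambiguation: score each candidate by shared bag-of-words vocabulary.
    def overlap(qid):
        return len(ctx & set(kb[qid]["description"].split()))
    return max(candidates, key=overlap)

print(link("Washington", "he studied at a university in Seattle"))  # → Q1
```

Each stage (mention lookup, candidate generation, scoring) is swappable, which is the modularity argument the paper makes when comparing design choices across the nine data sets.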