Last Modified: January 8, 2017
CS 295: Statistical NLP: Winter 2017
Project Pitch
Sameer Singh
http://sameersingh.org/courses/statnlp/wi17/
One of the biggest indicators of whether a course project will be enjoyable and successful is how early the
students start working on it. Thus, I want to encourage you to start forming groups, discussing project ideas, and getting
feedback from me as early in the quarter as possible. To facilitate this, you will be submitting a short pitch of your
project. Each project pitch (one per group) will be a one-page PDF write-up uploaded to Canvas by January 23, 2017,
containing the following sections.
1 What to Submit?
1.1 Team Details
Your pitch should contain the following details about the group:
1. Team Name: A short, memorable name for your group. If you are struggling with this, just find an interesting
word that uses all your initials, or simpler still, just pick a random food item :)
2. Members: Names and IDs of all the group members. Add yourself to a “group” on Canvas (People>Project Groups).
If there are four members, justify why. If there are fewer than three members, make an appointment to meet on January 17, 2017.
3. Division of Labor: In a single sentence for each member, describe their relevant background and the part of
the project they are likely to contribute to.
4. Diversity: Add a single sentence to describe what you think makes your group diverse.
1.2 Problem Setup and Motivation
In a paragraph or so, describe the problem that your contribution will be addressing. This should include, at the
very least, what the formal problem setup is (what are the input and the output?) and why it is important (potential
applications?). A brief summary of related work should be included to support your project, such as the
similar tasks that have been studied in the literature so far.
1.3 Proposed Approach
In a few sentences, describe the strategy you envision taking to solve the problem you proposed above. I do not
need a detailed solution, but instead a very high-level understanding of how you intend to bridge the gap between
the input and the expected outputs. Of course, this is not set in stone, and I expect it to change considerably as
you learn more about NLP and your project; however, this section is where you will argue why it is an achievable
goal for the quarter.
1.4 Evaluation Plan
Describe, in a paragraph, how you intend to evaluate whether your project was successful. Although you may
include a brief description of the qualitative results (such as examples, important features, etc.), you should focus
on empirical evaluation. What is the dataset you will evaluate on? If you do not have any, how will you collect
it? What is the evaluation metric? What is the baseline you definitely want to beat, or alternatively, what is the
performance you would be happy with (e.g. > 80% accuracy)?
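As a rough illustration of what "metric plus baseline" can look like in practice, here is a minimal sketch in Python. It assumes a classification task evaluated by accuracy and uses a majority-class baseline; both choices, the function names, and the toy labels are hypothetical and only for illustration. Your project may well call for a different metric and a stronger baseline.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Fraction of examples whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def majority_baseline(train_labels, n_test):
    """Predict the most frequent training label for every test example."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return [most_common_label] * n_test


# Hypothetical toy labels, made up for illustration only.
train_labels = ["pos", "pos", "neg", "pos"]
test_gold = ["pos", "neg", "pos", "pos", "neg"]
system_output = ["pos", "neg", "pos", "neg", "neg"]

baseline_acc = accuracy(test_gold, majority_baseline(train_labels, len(test_gold)))
system_acc = accuracy(test_gold, system_output)

print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"system accuracy:   {system_acc:.2f}")
```

Reporting your system next to a simple baseline like this makes it clear how much of the performance comes from your approach rather than from the label distribution of the dataset.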
1.5 Instructor Appointment
Make an appointment with me (using Canvas) to discuss your project pitch, and mention in your pitch the date and
time you have selected.
Project Pitch UC Irvine 1/ 2
2 Tips and Suggestions
Here are some suggestions as you think about your project ideas and groups:
Play to your strengths. I want you to work on something you like doing, and have some expertise in, as much
as possible. If you are a machine learner, think about which NLP tasks your methods can be applied to. If you
have some problem you care about, propose a novel task or dataset for others to collaborate on. If you have
some ongoing research that is relevant, identify a small, independent research question, and propose it.
Skim papers. Go through the list of papers on the course website (I will be adding more papers soon). Read
their abstracts, and see if any of them get you excited. Browse titles from recent NLP and ML conferences
(ICML, NIPS, AAAI, ACL, NAACL, EMNLP, and KDD), and see if you can find their datasets or codebases. For
even more “niche” topics, look at conference workshops to see if you like something, e.g.
http://naacl.org/naacl-hlt-2016/workshops.html.
Use GitHub. This may be obvious to many of you, but needs to be said. Learn git, and use GitHub to share
as much as possible with your group: reports, code, data, documentation, etc. You may also want to consider
using other features like issues, a website (GitHub Pages), and the wiki. I also encourage you to make your
repositories public for open source access (maybe use a license like Apache), but you are free to keep them
private if you like (GitHub provides free private repos). Feel free to include the repo URL in the pitch if you
decide to make it public, or want to share the private repo with me (GitHub username: sameersingh).
Use Piazza. Piazza can be a useful place to find classmates based on expertise and brainstorm different ideas.
For example, if you have an idea but not a team, propose it on Piazza to see if someone else is interested.
You can also advertise yourself (“Can do neural stuff!”) and see if someone needs you. I encourage you to
continue using Piazza even after the teams are finalized, such as to compare your results to others working
on the same/similar data, or to solicit feedback/suggestions from others on software libraries.
Start work early. I really cannot stress this enough.