Week 1: Brainstorm the scope of the project
From Jason Chiau:
How do humans look up unknown notation?
I find the nearest clickable link and click on it, hoping it takes me one step closer to the answer. By doing this recursively, I can find my answer most of the time.
What if I let a computer imitate this process on its own?
I can traverse Wikipedia, build a subset of pages around the notation that I don't understand, and then run PageRank or a DFS over that subset.
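A minimal sketch of that idea, assuming the link structure has already been collected into a dictionary. The `LINKS` graph, the page titles, and the function names are made up for illustration; a real crawl of Wikipedia would replace the toy data. A depth-limited DFS builds the subset, and a plain power-iteration PageRank scores the pages inside it.

```python
# Hypothetical link graph: page title -> titles it links to (toy data).
LINKS = {
    "Gradient descent": ["Gradient", "Learning rate"],
    "Gradient": ["Partial derivative", "Nabla symbol"],
    "Learning rate": ["Gradient descent"],
    "Partial derivative": ["Nabla symbol"],
    "Nabla symbol": ["Gradient"],
}


def reachable(links, start, max_depth):
    """Depth-limited DFS: pages reachable from start in at most max_depth clicks."""
    best = {}  # page -> smallest depth at which it was visited so far
    stack = [(start, 0)]
    while stack:
        page, depth = stack.pop()
        if best.get(page, max_depth + 1) <= depth:
            continue  # already visited via a path that is at least as short
        best[page] = depth
        if depth < max_depth:
            for nxt in links.get(page, []):
                stack.append((nxt, depth + 1))
    return set(best)


def pagerank(links, pages, damping=0.85, iterations=50):
    """Power-iteration PageRank restricted to the given subset of pages."""
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            out = [q for q in links.get(p, []) if q in rank]
            if out:
                share = damping * rank[p] / len(out)
                for q in out:
                    new[q] += share
            else:
                # Page with no links inside the subset: spread its rank evenly.
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank


subset = reachable(LINKS, "Gradient descent", max_depth=2)
ranks = pagerank(LINKS, subset)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{ranks[page]:.3f}  {page}")
```

The highest-ranked pages in the subset would be the first candidates to read for an explanation. Whether PageRank is the right relevance signal here is still an open question.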
What are the problems with doing this?
How can I generate a core explanation from the page I find?
What if the target is not searchable as text?
What if the same notation can refer to completely different things?
If so, how can I take context into consideration?
If the search term is a math formula, what is its programmatic representation? LaTeX? An image?
How to deal with different descriptions of the same problem? NLP -> search -> NLG?
If we are going to do the search, is it pre-indexed or dynamic?
From Jenny:
I think we can start by parsing LaTeX files, since they are fairly well structured. There are two sources of math symbols/equations:
symbols that are defined within the file: we could do a lookup?
symbols that are convention: may need to search on Wikipedia -> PageRank (like Jason mentioned)
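A rough sketch of the in-file lookup, assuming that definitions tend to appear as phrases like "$x$ is ..." or "$x$ denotes ..." near the symbol. The regular expression, the sample sentence, and the function names are hypothetical; real papers will need more patterns, and this heuristic can mis-pair dollar signs.

```python
import re

# Hypothetical heuristic: inline math followed by "is", "denotes", or
# "represents", then the rest of the clause up to punctuation or more math.
DEFINITION = re.compile(
    r"\$([^$]+)\$\s+(?:is|denotes|represents)\s+([^.,;$]+)"
)


def local_definitions(latex_source):
    """Map each inline-math symbol to the phrase that appears to define it."""
    table = {}
    for symbol, meaning in DEFINITION.findall(latex_source):
        # Keep the first definition found for each symbol.
        table.setdefault(symbol.strip(), meaning.strip())
    return table


def lookup(symbol, table):
    """Return the in-file definition, or None to signal an external search."""
    return table.get(symbol)


SAMPLE = (
    r"We minimize $L$ with step size $\eta$, where $\eta$ is the "
    r"learning rate, and $L$ denotes the training loss."
)

table = local_definitions(SAMPLE)
print(lookup(r"\eta", table))
print(lookup("L", table))
print(lookup(r"\nabla", table))  # not defined in the file
```

A `None` result is where the second source would take over: the symbol is presumably a convention, so it goes to the Wikipedia search.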
In terms of the length of equation we are targeting, I think we could start with single symbols. NLG for math equations may not be as straightforward as the CSS descriptions in the paper.
I did a literature review, and there does not seem to be any software for explaining math equations yet. However, I did find some math parsers (more like math calculators) online:
http://mathparser.org/mxparser-tutorial/using-internal-help/
http://www.antlr.org/ (this one is mentioned in the paper, I think it's pretty relevant, just put it here)
These only support a fixed set of operations, so I assume they rely on pre-built classes and functions. Still, we can learn from how they interpret math equations.
From Lysia:
What could be leveraged from Tutoron?
Routine
- Detection of math equations/notation
- Parse detected region
- Explain by traversing the parse structure
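The parse-then-explain steps of this routine could look roughly like the sketch below. As a simplifying assumption, the equation is written in Python expression syntax rather than LaTeX, so the standard-library `ast` module can serve as the parser; the phrase templates and function names are invented for illustration.

```python
import ast

# Hypothetical phrase templates, one per supported binary operator.
OPERATOR_WORDS = {
    ast.Add: "the sum of {} and {}",
    ast.Sub: "the difference of {} and {}",
    ast.Mult: "the product of {} and {}",
    ast.Div: "{} divided by {}",
    ast.Pow: "{} raised to the power {}",
}


def explain(node):
    """Recursively turn a parse-tree node into an English phrase."""
    if isinstance(node, ast.Expression):
        return explain(node.body)
    if isinstance(node, ast.BinOp):
        template = OPERATOR_WORDS.get(type(node.op))
        if template is None:
            raise ValueError(f"unsupported operator: {type(node.op).__name__}")
        return template.format(explain(node.left), explain(node.right))
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return str(node.value)
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


def explain_equation(text):
    """Parse the expression, then explain it by walking the parse tree."""
    tree = ast.parse(text, mode="eval")
    return explain(tree)


print(explain_equation("a * x ** 2 + b"))
# the sum of the product of a and x raised to the power 2 and b
```

The output is correct but stilted, which hints at how much of the work would sit in the language-generation step rather than in the parsing.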
Design guideline
Reappropriate existing tools/parsers like the ones Jenny mentioned
Inspect large-scale math equations in the field ahead of time to mine common usage
Deliverable
- web app
Scope
Where to start? Papers? Wikipedia page?
LaTeX? PDF?
Whether we go with papers or Wikipedia pages, we should pick a field, e.g. neural-network-related papers/wiki pages
Explanation of a single math notation, or explanation of an entire equation?
- e.g. Tutoron does both: it explains each argument of wget, and also explains the purpose of the entire line of code
Final deliverable?
- Web plugin? Chrome extension? PDF plugin? Similar to Tutoron?
Potential user
- People from academia, e.g. grad students, professors, researchers, etc.
Potential technical skills involved
How to build a parse tree? How to generate simple natural-language explanations?
Search