|
Student |
Mentor |
|
Phyo Thiha Swarthmore College, PA. |
William M. Pottenger, PhD Associate Research Professor Computer Science and DIMACS Rutgers University |
|
DIMACS Summer 2008 REU |
|
Project Description |
|
Coming soon… If you can’t wait, please check the links to the presentation slides provided below ;-) |
|
Log |
|
|
Action |
Start |
End |
|
1. |
Reading for the literature search:
– “The Power of Word Clusters for Text Classification” (Slonim & Tishby)
– “A Novel Bayesian Classifier for Sparse Data” (Ganiz & Pottenger)
– “Mining Higher-Order Association Rules from Distributed Named Entity Databases” (Li, Janneck, et al.)
– Ian H. Witten and Eibe Frank, “Data Mining: Practical machine learning tools and techniques”, 2nd Edition, Morgan Kaufmann, San Francisco, 2005. |
Week 1; Week 1; Week 1; Week 1 |
Week 1; Week 1; Week 2; Present |
|
2. |
Read and tried to understand the code “IBA_1.0: Information Bottleneck Clustering (2003)” provided at URL: http://www.princeton.edu/~nslonim/ |
Week 2 |
Week 2 |
|
3. |
Wrote summaries of each paper mentioned above and of Chapters 1, 2, 3, and 4 of “Data Mining: Practical machine learning tools and techniques” |
Weeks 1, 2, 3, and 4 |
Weeks 1, 2, 3, and 4, respectively |
|
4. |
– Presentation 1: Introduction to my project and plans
– Contacted Slonim; working through his paper on word clusters in more detail |
Tuesday, June 17 |
|
|
5. |
– Changed focus to the HONB modification; read the code Murat wrote; re-read the HONB paper mentioned above
– Was assigned to run experiments to find the best results for SMO in Weka 3
– Was also assigned to skim the documentation on software design and learn the HONB software architecture; was asked to stop after we switched focus to running the SMO tests |
Week 4; Week 4; Week 4 |
Week 5; Week 4 |
|
6. |
– Wrote a summary of the findings on the SMO results and handed it to Professor Pottenger
– Re-read Chapter 5 and part of Chapter 6 of “Data Mining: Practical machine learning tools and techniques”
– Started studying the code for filtering higher-order paths in HONB |
Early Week 5; Week 5; End of Week 5 |
Early Week 6
|
|
7. |
– Got a reply from Murat and used the code snippet he provided to modify HONB to extract pure higher-order paths; decided to give up and change course
– Ran experiments for pure/filtered HONB
– Wrote a summary of the findings and handed in a report to the professor |
Early Week 6; Week 6; End of Week 6 |
End of Week 6 |
|
8. |
– Re-read Chapters 5 and 6 of the textbook; started brainstorming potential project ideas
– Prepared for the REU final presentations
– Was assigned the task of correcting the frequency of occurrences in the HONB code
– Gave the final presentation; updated the webpage |
End of Week 6; Early Week 7; Mid Week 7; Thursday, July 17
|
Week 7
|
|
Resources & Links |
|
1. URL: http://www.princeton.edu/~nslonim/ ⇒ Noam Slonim’s webpage. A good place to see his work (related to IB and word clustering) and to retrieve his papers.
2. URL: http://citeseer.ist.psu.edu/ ⇒ CiteSeer.IST. A digital library for downloading scientific research papers.
|