Team members
Amarjyot Kaur Narula (ISTD), Joey Yeo Kailing (ISTD), Philia Neo Tong Wee (ESD), Tan Jing Kang (ESD), Tan Zi Ning (ESD), Wu Rong (ESD), Xie Han Keong (ISTD)
Instructors:
Writing Instructors:
Teaching Assistant:
01
Sensemaking, the analysis of content from local and international news outlets and social media platforms, is important for generating insights in the homeland security landscape. These sources form part of a repertoire of data sources that help MHA officers make informed decisions to keep Singapore safe and secure. The current sensemaking process is time-consuming and tedious because it is done entirely by hand. This manual process does not make good use of man-hours or monetary resources, and it misses opportunities to utilise the data. Our mission is to design a sensemaking tool in the form of an integrated news feed and analytics dashboard. The tool will allow MHA officers to browse online materials more efficiently and to analyse them and generate insights more effectively, saving time and improving the quality of the insights generated.
Project Description
Description
Mission
02
The first image illustrates our overall architecture and data flow: raw news articles are scraped and passed through our Natural Language Processing (NLP) tasks, the results are stored in the database, and a frontend user interface displays them. The second image illustrates the data flow for our NLP tasks, from the raw news articles to the desired outputs, which are meant to partially assist the officers in sensemaking.
Architecture
Overall Architecture
NLP Tasks Flow
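The overall data flow described above (scraped articles, NLP tasks, database, frontend) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the team's actual code: the `Article` class, the `run_pipeline` function, the in-memory dictionary standing in for the database, and the two placeholder stages are all hypothetical names introduced here. Clustering is left out because it works across the whole set of articles rather than on one article at a time.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Article:
    """A scraped news article plus the NLP outputs attached to it (hypothetical)."""
    article_id: str
    text: str
    outputs: Dict[str, object] = field(default_factory=dict)


# A stage maps raw article text to one NLP output (summary, topic, ...).
Stage = Callable[[str], object]


def run_pipeline(
    articles: List[Article],
    stages: Dict[str, Stage],
    database: Dict[str, Article],
) -> Dict[str, Article]:
    """Run every stage on every article and store the result in the database."""
    for article in articles:
        for name, stage in stages.items():
            article.outputs[name] = stage(article.text)
        # The dict stands in for the real database that the frontend reads from.
        database[article.article_id] = article
    return database


# Placeholder stages: trivial stand-ins for the real summariser and classifier.
def first_sentence(text: str) -> str:
    return text.split(".")[0].strip() + "."


def keyword_topic(text: str) -> str:
    return "terror attack" if "attack" in text.lower() else "other"


database: Dict[str, Article] = {}
run_pipeline(
    [Article("a1", "An attack was reported downtown. Police are investigating.")],
    {"summary": first_sentence, "topic": keyword_topic},
    database,
)
print(database["a1"].outputs)
```

Keeping each NLP task behind the same text-in, output-out interface is one way to let stages be swapped or evaluated independently; whether the project was organised this way is an assumption.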
03
Each NLP task has its own desired output, listed below. We explored multiple algorithms for each task and, based on the evaluations we conducted, settled on the most suitable ones.
04
✨ Homepage has visualisation components
🎉 We once again wish to extend our gratitude to our HTX industry mentors: Sylvia Liaw, Dr Terence Tan, Lim Ming En, Martyn Wong and Ong Pang Wei. Thank you for your continuous support and for pushing us to do better.
Industry Partner
Backend
Frameworks
NLP Tasks
1. Clustering
🎯 Obtaining news articles that belong to the same event
✅ Latent Dirichlet Allocation (LDA)
2. Summariser
🎯 Single-document summary (long and short) for every news article
✅ Bidirectional and Auto-Regressive Transformers (BART) & TextRank
🎯 Multi-document summary for every event
✅ SummPip
3. Entity Extractor
🎯 Extracting entities (e.g., number of people killed, weapons used) for every news article
✅ Information Extractor (IE) pipeline -- Bidirectional Encoder Representations from Transformers (BERT) fine-tuned on the Stanford Question Answering Dataset (SQuAD 2.0)
4. Topic Classification
🎯 Categorising news articles into labels (e.g., terror attack, disease)
✅ Single Binary Classifier
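To give a feel for one of the techniques named above, here is a minimal, dependency-free sketch of a TextRank-style extractive summariser. It is not the team's implementation: the function names, the regex-based sentence splitter, the word-overlap similarity, and the parameter defaults are all assumptions chosen to keep the example short. It builds a graph of sentences weighted by shared words, scores the sentences with PageRank-style iterations, and returns the top-scoring sentences in their original order.

```python
import math
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter (assumption: sentences end with . ! or ?)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def similarity(a: str, b: str) -> float:
    """Word-overlap similarity, normalised by the log of the sentence lengths."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # avoids dividing by log(1) == 0
    return len(words_a & words_b) / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank_summary(
    text: str, num_sentences: int = 1, damping: float = 0.85, iterations: int = 50
) -> List[str]:
    """Return the highest-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= num_sentences:
        return sentences

    # Weighted, undirected sentence graph stored as an adjacency matrix.
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]

    # PageRank-style score updates over the sentence graph.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]


text = (
    "The attack injured ten people in the city. "
    "Police said the attack in the city was planned. "
    "Bananas are yellow."
)
print(textrank_summary(text, num_sentences=2))
```

In this toy input the third sentence shares no words with the other two, so it receives only the baseline score and is dropped from the two-sentence summary. A production summariser would typically use a proper tokeniser and stop-word handling.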
Frontend
Designing to Simplify the Users' Journey
User Interface Features
- Meant to assist in exploring the data with ease
- Visualisations give analysts a high-level view of the data so they can familiarise themselves with the big picture
✨ Clean pages displaying articles in each topic and in each event
- Provides more granular details such as entities that support the analysts in understanding each individual event
✨ Article page that has convenient features such as copy to clipboard options
- Meant for easy sharing between colleagues or for note-taking
Amarjyot Kaur Narula
Information Systems Technology and Design
Joey Yeo Kailing
Information Systems Technology and Design
Philia Neo Tong Wee
Engineering Systems and Design
Tan Jing Kang
Engineering Systems and Design
Tan Zi Ning
Engineering Systems and Design
Wu Rong
Engineering Systems and Design
Xie Han Keong
Information Systems Technology and Design
© 2021 SUTD