Leveraging AI for legislative text classification

Photos of the USAFacts senior project team giving their presentation on Zoom

Team members

Kasha Akrami, Marlies Michielssen, Andrew Tang, Brooke Tran

The organization

USAFacts is a non-profit organization and website that provides data and reports on the US population, the government's finances, and the government's impact on society. USAFacts aims to make facts easily accessible so that public discourse and understanding can be grounded in them.

USAFacts logo

Project description

USAFacts classifies legislative actions by category to illustrate where the priorities of Congress lie. Currently, this classification is done manually, which is time- and labor-intensive. The goal of this senior project is to produce an automated tool that processes legislative text documents and classifies each one into its appropriate topical category (e.g., Immigration, Social Welfare, Environmental Protection). Relative to manual classification, we hope our tool can save a significant number of working hours and increase the scope and scale of the government documents that USAFacts includes in its analyses. More broadly, an automated tool will enable USAFacts to generate a more complete and accessible analysis of legislative actions and give the general public better insight into societal problems.

Solutions, methods, and models used

We developed several models capable of processing text and categorizing legislative actions: multi-class logistic regression on tf-idf scores, a Hierarchical Attention Network, and fastText. Ultimately, we recommend the fastText model, since it achieved both the highest accuracy and the shortest training time.
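
To make the simplest of these baselines concrete, here is a minimal sketch of multi-class (softmax) logistic regression on tf-idf features, written with only the Python standard library. It is not the team's code: the six example sentences, the function names, and the training settings (200 epochs, learning rate 0.5) are all hypothetical, and a real system would more likely rely on an established library and far more labeled documents.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the training set."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf(text, idf):
    """L2-normalized tf-idf vector as a sparse dict; unseen terms are ignored."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def predict_proba(vec, weights, bias):
    """Softmax over one linear score per class."""
    scores = {c: bias[c] + sum(weights[c].get(t, 0.0) * v for t, v in vec.items())
              for c in weights}
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

def train(docs, labels, epochs=200, lr=0.5):
    """Fit the classifier by stochastic gradient descent on the cross-entropy loss."""
    idf = fit_idf(docs)
    vectors = [tfidf(d, idf) for d in docs]
    classes = sorted(set(labels))
    weights = {c: {} for c in classes}
    bias = {c: 0.0 for c in classes}
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            probs = predict_proba(vec, weights, bias)
            for c in classes:
                grad = probs[c] - (1.0 if c == y else 0.0)
                bias[c] -= lr * grad
                for t, v in vec.items():
                    weights[c][t] = weights[c].get(t, 0.0) - lr * grad * v
    return idf, weights, bias

def classify(text, model):
    idf, weights, bias = model
    probs = predict_proba(tfidf(text, idf), weights, bias)
    return max(probs, key=probs.get)

# Hypothetical toy data; the category names come from the project description.
TRAIN_DOCS = [
    "a bill to reform visa processing and border security",
    "a bill to expand asylum and visa eligibility for refugees",
    "a bill to fund food assistance and housing benefits",
    "a bill to extend unemployment benefits and child care assistance",
    "a bill to limit carbon emissions and protect clean water",
    "a bill to protect wetlands and reduce air pollution emissions",
]
TRAIN_LABELS = [
    "Immigration", "Immigration",
    "Social Welfare", "Social Welfare",
    "Environmental Protection", "Environmental Protection",
]

model = train(TRAIN_DOCS, TRAIN_LABELS)
print(classify("a bill to expand visa processing for refugees", model))
```

Words that appear in every document, such as "bill", receive the lowest idf weight, so the category-specific vocabulary dominates each prediction. That is the main reason tf-idf scores are a common starting point before moving to models such as fastText.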

All 2021 senior projects