Text Mining Analysis for Public Policy
Course overview
This module will explore how to gather and mine texts for policy analysis and evaluation. Goods, services and even people are subject to continuous evaluation online, and the same is true for public policies. Today governments are increasingly using e-participation tools that allow citizens to express their opinions on specific programmes and to play an active role in decision-making. As a result, huge amounts of useful information are now available on the internet in text form.
We will explain how to scrape websites to retrieve this information and share different analysis techniques that will equip you to answer policy analysis and evaluation questions. We will also go through methods such as Optical Character Recognition (OCR) for extracting digitised text data from printed documents.
What does this course cover?
Week 1
The first week will provide an overview of coding in general. This module provides students with important hard digital skills, introducing them to one of the most widely used community-based software packages in policy analysis: R. Despite the differences between the languages of different software packages used in policy analysis (eg R, Stata, Python), some concepts are common to all of them, such as “if” statements, loops and so on. The main functions students learn will pave the way to learning other languages, such as Python, SQL and other more complicated languages.
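To illustrate the kind of concept that carries over between languages, here is a minimal sketch of an “if” statement inside a loop. It is written in Python, one of the other languages named above, rather than R, and it is not taken from the course materials. The comments, the `label_length` helper and the five-word threshold are all invented for illustration.

```python
# Hypothetical citizen comments; invented for illustration only.
comments = [
    "The new bus route is great",
    "Too slow",
    "Please extend the consultation period for the housing plan",
]


def label_length(comment, threshold=5):
    """Return 'long' if the comment has more than `threshold` words, else 'short'."""
    # The "if" statement chooses between two outcomes.
    if len(comment.split()) > threshold:
        return "long"
    else:
        return "short"


# The loop applies the same rule to every comment in turn.
labels = []
for comment in comments:
    labels.append(label_length(comment))

print(labels)
```

The same two building blocks, a condition and a loop, exist in R with slightly different syntax, which is the point the paragraph above makes.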
Week 2
The second class will focus on data gathering: how data may be gathered and imported from the internet. This will introduce policymakers to web harvesting techniques in R. We will first discuss legal, ethical and technical considerations for scraping data on the internet. We will then study the structure of HTML pages and see the main features that are useful when doing web scraping. Finally, we will go through the main R packages and functions for web scraping and we will apply them to hands-on case studies.
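As a rough illustration of why the structure of an HTML page matters for scraping, the sketch below pulls the text out of paragraphs that carry a particular `class` attribute. It uses Python's standard-library `html.parser` rather than the R packages taught in the course, and it works on an invented HTML string instead of a live website, so no site is actually contacted. The page content, the `comment` class name and the `CommentExtractor` helper are assumptions made for this example.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this instead.
SAMPLE_HTML = """
<html><body>
  <h1>Consultation on cycling lanes</h1>
  <p class="comment">More protected lanes, please.</p>
  <p class="note">Comments close in June.</p>
  <p class="comment">Parking matters too.</p>
</body></html>
"""


class CommentExtractor(HTMLParser):
    """Collect the text of every <p class="comment"> element."""

    def __init__(self):
        super().__init__()
        self.in_comment = False
        self.comments = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "p" and dict(attrs).get("class") == "comment":
            self.in_comment = True
            self.comments.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_comment = False

    def handle_data(self, data):
        # Only keep text that sits inside a matching paragraph.
        if self.in_comment:
            self.comments[-1] += data


def extract_comments(html):
    parser = CommentExtractor()
    parser.feed(html)
    parser.close()
    return [c.strip() for c in parser.comments]


comments = extract_comments(SAMPLE_HTML)
print(comments)
```

Selecting elements by tag and attribute in this way is the general idea behind most scraping tools; checking a site's terms of use and robots.txt before collecting real data remains a separate step.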
Week 3
In the final week we will show how R can be used to visualise and analyse text as data. We will start with text pre-processing and some descriptive text features such as length, N-grams and so on. We will then cover the basic supervised and unsupervised approaches to text analysis: sentiment analysis, scaling techniques, topic modelling etc. We will finally go through the R packages and functions for these techniques and apply them to hands-on case studies.
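A small sketch of the pre-processing and N-gram steps mentioned above: the text is lower-cased, split into word tokens, and then counted as bigrams (N-grams with N = 2). It is written in Python with the standard library only, not with the R packages used in the course, and the example sentence and the `tokenize` and `ngrams` helpers are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and return its word tokens, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical text; invented for illustration only.
text = "Public transport is good. Public transport is cheap."

tokens = tokenize(text)
bigram_counts = Counter(ngrams(tokens, 2))

print(len(tokens))
print(bigram_counts.most_common(2))
```

Counts like these are purely descriptive; sentiment analysis, scaling and topic modelling build on the same tokenised representation.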
What will I achieve?
The learning outcomes of this module are as follows:
- To demonstrate a sophisticated grasp of coding concepts, especially the application of coding in R to complex policymaking
- To demonstrate the critical ability to undertake web scraping and text analysis in R
- To acquire the skills to critically assess web data in the context of writing policy proposals
- To have an advanced understanding of how data processes can be applied critically to problem-solving
- To learn basic concepts of coding (which in future can be applied to any coding language)
- To learn specific software packages, such as R
Who will I learn with?
Senior Lecturer in Public Policy
Who is this for?
This short course is for mid-career professionals. Standard entry requirements are a 2:1 degree plus 3 years of relevant work experience. Applicants without a 2:1 or higher degree are welcome to apply and typically require 5+ years of relevant work experience.
How will I be assessed?
One written assignment, plus participation in webinars and discussion forums.
Our modules offer high levels of interaction with regular points of assessment and feedback. Each four-week module is worth five Master's level academic credits and includes three webinars with a King's lecturer and a peer group of global professionals.
What is the teaching schedule?
Format: Fully online, plus 3 x 1-hour weekly webinars
This module has been designed specifically for an online audience. It uses a range of interactive activities to support learning, including discussion forums, online readings, interactive lecture videos and online tutorials.