Using the Crime Data available on the website, you are tasked with writing a PYTHON program (or a set of Python programs if you prefer) which Reports, Analyses, and Visualises these datasets.

You are advised to work through the introductory Coursework Tutorial on Blackboard (entitled PYTHON CW Tutorial 1.docx), which will introduce you to the Crime Data (and how these data can be used in Microsoft Excel and Access).  By understanding the nature of the crime datasets, you will be able to formulate some ideas as to what might be worth investigating, analysing and visualising in Python.

The Coursework is open-ended, in that it is entirely up to you which datasets you use: for example, which police forces (South Wales Police; Gwent Police; Metropolitan Police; etc., or data from multiple police forces) and which time periods (August 2021; Summer 2021; 2020; 2018-2021; pre- and post-lockdown; or combinations of these).  However, it is expected that you will use data from more than one month OR more than one Police Force.

The coursework is open-ended in terms of what you do with the data, but you will be assessed in terms of:

(a)  The Reporting of the Data Sets (for example, being able to give an overview of the data by month/season/year/Police Force, such as total crimes; breakdown of crimes; comparison of crimes; or other attributes such as location or outcomes).  Can you identify some interesting facts?
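As an illustration of the kind of reporting meant in (a), the sketch below counts total crimes and produces a breakdown by crime type.  It is only one possible approach: the list of crime types is a made-up placeholder (in practice it would come from one of the CSV files), and the function name breakdown is hypothetical.

```python
from collections import Counter

# Placeholder values for illustration only; real values would be read
# from the crime type column of a downloaded CSV file.
crime_types = ["Burglary", "Vehicle crime", "Burglary", "Shoplifting", "Burglary"]


def breakdown(crime_types):
    """Return the total number of crimes and a list of (crime type, count)
    pairs, with the most frequent crime type first."""
    counts = Counter(crime_types)
    return len(crime_types), counts.most_common()


total, by_type = breakdown(crime_types)
print(f"Total crimes: {total}")
for crime, n in by_type:
    # Left-align the crime type and right-align the count for a simple table.
    print(f"{crime:<15}{n:>5}")
```

The same function could be called once per month or per Police Force to build a comparison table.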

(b)  The Analysis of the Data Sets (an extension of (a) above that considers further statistical reporting of the data, such as results normalised by total crime (% of all crime), or even by population (crimes per 1,000 people).  You may need to source additional data to help you with this.  In addition, you may wish to test some hypotheses as part of your analysis, e.g. Is crime increasing through time? Is burglary more prevalent in Summer or Winter? Does South Wales Police data correlate with other Police Force data for the same time frame? Which crimes increased or decreased during the Pandemic Lockdown?).
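The normalisations mentioned in (b) amount to simple arithmetic, and a correlation between two forces can be computed with a few lines of code.  The sketch below is a minimal example under stated assumptions: all counts, monthly totals and the population figure are invented placeholders (a real population figure would have to be sourced separately), and the function names are hypothetical.

```python
import math


def percent_of_total(counts):
    """Convert a {crime type: count} dictionary into % of all crime."""
    total = sum(counts.values())
    return {crime: 100 * n / total for crime, n in counts.items()}


def rate_per_1000(crime_count, population):
    """Crimes per 1,000 people."""
    return 1000 * crime_count / population


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder numbers for illustration only, not real crime data.
counts = {"Burglary": 30, "Vehicle crime": 50, "Shoplifting": 20}
population = 50_000
force_a_monthly = [100, 120, 90, 110]
force_b_monthly = [200, 230, 190, 215]

print(percent_of_total(counts))
print(rate_per_1000(sum(counts.values()), population))
print(pearson(force_a_monthly, force_b_monthly))
```

A coefficient close to 1 would suggest that the two forces' monthly totals rise and fall together; it does not by itself explain why.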

(c)  The Visualisation of the Data Sets (for example, using matplotlib to reinforce the analysis you have undertaken, e.g. graphs or a visualisation strategy appropriate to the message you wish to get across.  As a specific example, Pie Charts of the breakdown of crimes for August 2021 might be compared between South Wales Police and the Metropolitan Police).
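The Pie Chart comparison described in (c) could be sketched as follows, drawing two pie charts side by side in one matplotlib figure.  The counts are invented placeholders rather than real August 2021 figures, and compare_pies is a hypothetical helper name; this is one possible layout, not a required one.

```python
import matplotlib.pyplot as plt


def compare_pies(counts_a, counts_b, title_a, title_b):
    """Draw two pie charts side by side and return the figure."""
    fig, axes = plt.subplots(1, 2, figsize=(10, 5))
    pairs = ((counts_a, title_a), (counts_b, title_b))
    for ax, (counts, title) in zip(axes, pairs):
        # autopct prints each slice's share as a percentage.
        ax.pie(list(counts.values()), labels=list(counts.keys()),
               autopct="%1.1f%%")
        ax.set_title(title)
    fig.tight_layout()
    return fig


# Placeholder counts for illustration only, not real crime data.
south_wales = {"Burglary": 30, "Vehicle crime": 50, "Shoplifting": 20}
metropolitan = {"Burglary": 45, "Vehicle crime": 35, "Shoplifting": 20}

fig = compare_pies(south_wales, metropolitan,
                   "South Wales Police", "Metropolitan Police")
```

Calling plt.show() displays the figure on screen, while fig.savefig with a suitable dpi value writes a high resolution image for the documentation.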

(d)  Advanced Analysis and/or Visualisation. You may wish to explore for yourself some of the other capabilities of Python using the many freely available libraries/extensions.  This could be advanced statistical or numerical modelling; or the use of the basemap extension of matplotlib to produce some crime maps; or even crime heat maps or hotspot maps; or the development of a graphical user interface (e.g. Tkinter) for your software.
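As a starting point for a hotspot map, crime locations can be grouped into grid cells and counted.  The sketch below assumes each record has a longitude and latitude available as numbers (records without coordinates would need to be skipped first); the coordinates, the cell size and the function name grid_counts are illustrative assumptions only.

```python
import math
from collections import Counter


def grid_counts(longitudes, latitudes, cell_size=0.01):
    """Count crimes per grid cell, where each cell is cell_size degrees wide.

    Returns a Counter keyed by (column, row) cell indices."""
    cells = Counter()
    for lon, lat in zip(longitudes, latitudes):
        cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        cells[cell] += 1
    return cells


# Placeholder coordinates for illustration only.
longitudes = [-3.175, -3.178, -3.205]
latitudes = [51.485, 51.483, 51.525]

cells = grid_counts(longitudes, latitudes)
# The busiest cells are candidate hotspots.
for cell, n in cells.most_common(3):
    print(cell, n)
```

The resulting counts could then be coloured and plotted, for example as a grid image or over a map, to show where crime is concentrated.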

(e)  Documentation of your work and annotation of your code.  All of your work must be fully documented in a Word or PDF file; and all of your code and datasets must be supplied in a folder so that they can be tested by your tutor.  For example, the documentation should focus on each aspect of your software which you wish to highlight, e.g. if your program tests a hypothesis, then clearly state what it is; how you went about testing and implementing this; and the results, including any graphs.  If you have used any additional libraries or extensions, then the documentation should clearly state this (and the source / implementation instructions), so that your work can be re-created by your tutor.  Make sure that all datasets used by any code are also highlighted in the documentation, along with any pre-conditions as to the pathname for a file.  All of your code must be fully annotated, especially the “neat” or complex features.  The results of your code should also be fully presented in your documentation, in case the code cannot be executed.  High resolution graphical output should be used.  If you refer to output such as PDFs or animations, these can be included separately in your submission, but include the pathname to the output in the documentation.

All of your work should be submitted via Blackboard as one compressed ZIP folder (not RAR or any other non-standard compression).  The ZIP folder should be given your enrolment number, e.g.

You will not be assessed specifically on the optimal quality of your code – just the ability to get an algorithm to do what you wanted it to do, whether it be 10 or 20 lines of code. However, the use of functions might help to simplify your code, e.g. a function to read a complete CSV dataset for any month or any police force and return the data as a series of lists.
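A function of the kind suggested above might look like the following sketch.  The column names (Month, Falls within, Crime type) are an assumption about the layout of the downloaded CSV files and should be checked against the header row of your own data; the function names and the sample text are hypothetical and for illustration only.

```python
import csv
import io

# Assumed column names; check them against the header row of your files.
MONTH_COL = "Month"
FORCE_COL = "Falls within"
TYPE_COL = "Crime type"


def read_crime_rows(file_obj):
    """Read crime records from an open CSV file and return three
    parallel lists: months, police forces and crime types."""
    months, forces, crime_types = [], [], []
    for row in csv.DictReader(file_obj):
        months.append(row[MONTH_COL])
        forces.append(row[FORCE_COL])
        crime_types.append(row[TYPE_COL])
    return months, forces, crime_types


def read_crime_csv(path):
    """Open one CSV file by pathname and return its columns as lists."""
    with open(path, newline="", encoding="utf-8") as f:
        return read_crime_rows(f)


# Tiny made-up sample so the sketch runs without any downloaded files.
SAMPLE = """Month,Falls within,Crime type
2021-08,South Wales Police,Burglary
2021-08,South Wales Police,Vehicle crime
2021-08,South Wales Police,Burglary
"""

months, forces, crime_types = read_crime_rows(io.StringIO(SAMPLE))
print(len(crime_types), "records read")
```

With real data, read_crime_csv would be called once per file, so the same code serves any month or any police force.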