In this tutorial we aim to provide the tools needed to work with gender-based violence (GBV) data. We will work with a database from a criminal court in Buenos Aires that has an open gender data policy. We will show attendees how to approach this data from a social and ethical perspective, and how to work with categorical data in Python.
Every week, court personnel upload the anonymized database to the cloud with the new rulings issued by the judge. Criminal Court No. 10 of the Ciudad Autónoma de Buenos Aires is the first criminal court to open its data with a gender perspective.
The Gender Data Observatory and the Women in Bioinformatics and Data Science Latam Network joined forces to put together a tutorial that shows people how to work with gender-based violence data from a social and ethical perspective. In this tutorial we will show how to handle categorical data in Python, how to analyze and visualize it, and some basic conclusions that can be reached using the NumPy and pandas libraries.
A member of the Criminal Court will also join the tutorial to share the court's experience of using open data for the public good.
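As a rough illustration of the kind of categorical analysis meant here, the following is a minimal sketch using pandas. The records, column names (`relationship`, `children_involved`) and category labels are made up for the example and are assumptions; they are not taken from the court's dataset, whose actual variables are described in the notebook.

```python
import pandas as pd

# Made-up toy records for illustration only; NOT real court data.
# Column names and labels are hypothetical.
records = pd.DataFrame({
    "relationship": ["partner", "ex-partner", "ex-partner",
                     "other", "partner", "ex-partner"],
    "children_involved": ["yes", "no", "yes", "no", "yes", "yes"],
})

# Store the text columns as pandas categoricals.
records["relationship"] = records["relationship"].astype("category")
records["children_involved"] = records["children_involved"].astype("category")

# Absolute and relative frequency of each category.
counts = records["relationship"].value_counts()
shares = records["relationship"].value_counts(normalize=True)

# Cross-tabulation of two categorical variables.
table = pd.crosstab(records["relationship"], records["children_involved"])

print(counts)
print(shares.round(2))
print(table)
```

Frequency tables and cross-tabulations like these are usually the starting point for categorical data, and the resulting counts can then be passed to a plotting library for bar charts.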
Regarding the method of data collection, the Criminal Court trained its staff to upload the rulings to a Google Sheet in Drive. The data is anonymized to protect the identity of the women reporting the violence. Throughout the years, the collection and publication process has been refined, but it is not yet automated. You can access the data through this link: https://drive.google.com/drive/folders/0B9wNhp3GjjazcmNTYzE4Rk1VQUU?resourcekey=0-392G3qH5cBis1dNPHZGo7g
About the bias: we know that our sample is too small to draw conclusions about gender-based violence in general. The Criminal Court is one of 31 courts in the city of Buenos Aires, and the data it collects comes from a few neighbourhoods, so it is not representative of the whole city or the whole country. 99% of the reports are made by cis women. Trans women, trans men and non-binary persons are not represented in this dataset. Another bias we detected is that the judge, Pablo Casas, actively fights for women's rights, so his rulings are not typical of judges in our country.
However, this experience helps us better understand gender-based violence within the home and family. Most of the reports come from women reporting their partners or ex-partners, and in many cases children are involved in the relationship. We also highlight the importance of working with open government data, using this court as an example in the hope that other courts will open their data as well.
When we work with this kind of data, we take the time to explain the context in which the dataset was created, we explain all the variables in the notebook, and we make the biases explicit in our analysis.