Important Notes

To get every individual nomination, more than 16,000 of them, I crawled the AMPAS database, which is my source for this piece. The Oscar dataset is pretty massive and didn’t come out of the database in a clean, machine-readable format. So you might want to read the notes and methodology section for information about how the dataset was built and structured. I’m open to criticism, debate and questions, so drop me a line at alice [at] silk.co. (I also still have conflicting thoughts about whether to look at absolute or weighted percentages: please share your thoughts on this! I went for weighted percentages but, for transparency, I kept the absolute counts in the data as well, so you can edit all the charts to plot these instead.)

Note on weight: Each nomination can have more than one nominee (for example, the Scientific And Engineering Award of 2014 went “to EMMANUEL PRÉVINAIRE, JAN SPERLING, ETIENNE BRANDT and TONY POSTIAU for their development of the Flying-Cam SARAH 3.0 system.”). To distinguish nominations with multiple nominees from those with a single nominee, I gave each nominee a weight of 1 divided by the total number of nominees sharing that nomination, him/her included. So, for example, the datapoints referring to EMMANUEL PRÉVINAIRE, JAN SPERLING, ETIENNE BRANDT and TONY POSTIAU each have a “Nomination Weight” of 0.25, while a sole nominee has a weight of 1.
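As a minimal sketch of this weighting (not the code actually used to build the dataset), the snippet below gives each nominee a weight of 1 divided by the number of nominees on the same nomination, which reproduces the 0.25 in the example above. The row layout, the nomination ids and the "A SINGLE NOMINEE" entry are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical rows: (nomination_id, nominee). The ids and the solo entry
# are invented for this sketch; they are not taken from the real dataset.
rows = [
    ("sci-eng-2014", "EMMANUEL PRÉVINAIRE"),
    ("sci-eng-2014", "JAN SPERLING"),
    ("sci-eng-2014", "ETIENNE BRANDT"),
    ("sci-eng-2014", "TONY POSTIAU"),
    ("solo-example", "A SINGLE NOMINEE"),
]

# Count how many nominees share each nomination.
nominees_per_nomination = Counter(nomination_id for nomination_id, _ in rows)

# Nomination weight = 1 / number of nominees sharing the nomination.
weights = {
    (nomination_id, nominee): 1 / nominees_per_nomination[nomination_id]
    for nomination_id, nominee in rows
}

for (nomination_id, nominee), weight in weights.items():
    print(nomination_id, nominee, weight)
```

With this definition the weights within any one nomination always add up to 1, so a shared award counts as much as a solo one when the weights are summed.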

When calculating the percentage of female and male nominees per year or award category, I ran both a raw calculation (for example: number of female nominees out of the total number of nominees) and a weighted one (for example: sum of the “Nomination Weight” of female nominees, divided by the total sum of “Nomination Weight” for male and female nominees).
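To show how the two calculations can diverge, here is a small sketch under made-up data: one woman nominated alone and four men sharing a single nomination. The function names and the sample list are hypothetical and only illustrate the two formulas described above.

```python
# Hypothetical nominees for one year: (gender, nomination_weight).
# One solo female nominee and four men sharing one nomination.
nominees = [
    ("female", 1.0),
    ("male", 0.25),
    ("male", 0.25),
    ("male", 0.25),
    ("male", 0.25),
]


def raw_share(nominees, gender):
    """Percentage based on head counts only."""
    matching = sum(1 for g, _ in nominees if g == gender)
    return 100 * matching / len(nominees)


def weighted_share(nominees, gender):
    """Percentage based on summed nomination weights."""
    matching = sum(w for g, w in nominees if g == gender)
    total = sum(w for _, w in nominees)
    return 100 * matching / total


print(raw_share(nominees, "female"))       # 20.0
print(weighted_share(nominees, "female"))  # 50.0
```

The raw figure counts people (one woman out of five nominees), while the weighted figure counts nominations (one out of two), which is why the choice between them matters.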

On this note: obviously, only people were counted when calculating the percentage of female and male nominees. Therefore, if a nomination has, for example, MGM, Marilyn Monroe and James Dean as nominees, each of the three will have a “Nomination Weight” of roughly 0.33 (one third). However, only the two people are added up to calculate male/female ratios (total: roughly 0.67, i.e. two thirds), so the company’s share is left out of the denominator.
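The MGM example can be worked through in a few lines. This is only a sketch: the "kind" and "gender" fields are assumed column names, not necessarily those used in the dataset.

```python
# The example nomination from the text: (name, kind, gender).
# "kind" and "gender" are assumed field names for this sketch.
nominees = [
    ("MGM", "company", None),
    ("Marilyn Monroe", "person", "female"),
    ("James Dean", "person", "male"),
]

# Every nominee, company included, gets the same weight of 1/3.
weight = 1 / len(nominees)

# Only people enter the male/female calculation.
people = [(name, gender) for name, kind, gender in nominees if kind == "person"]

people_total = weight * len(people)  # two thirds
female_total = weight * sum(1 for _, g in people if g == "female")

# Share of the people-only weight held by female nominees.
female_share = 100 * (female_total / people_total)

print(round(weight, 2), round(people_total, 2), female_share)
```

Because the company still takes its third of the weight, the people on this nomination contribute two thirds in total rather than a full 1, but the female/male split among them is unaffected.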

Stories with data, from the data collection (or scrape) to the data visualization. Data storytelling instructor. Currently project leader at batjo.eu