Base Rate Calculation

Note: BWN base rate calculations are a work in progress and feedback is welcome on all parts of the BWN methodology.

For conferences in fields without established base rates, we use NIH Reporter to find the base rate of women in the field. This first requires choosing keywords that cover the conference in question. Once that is done, there are two ways to calculate the base rate:

  1. Manually
    • Choose a sample of the results pages (e.g., if there are 20 pages, you can choose pages 5, 10, and 15) and count the men and women named on those pages. Results are ordered alphabetically, so duplicates are easy to spot; count each person only once. For researchers whose sex cannot be determined from their name, search for the person online and determine sex from photos. To report the base rate, provide: the keywords used for the search, the pages that were counted (out of how many), and the resulting base rate.
  2. Using a Python script (40 NIH page limit/day)
    • Export all projects matching the chosen keywords (using the export option in the top right corner of NIH Reporter), making sure all fields are selected when exporting. Then use this Python script, which extracts names and infers sex from a probabilistic mapping of names to sex. The output is a CSV file that includes any names the script’s API could not assign a sex to; for these, search online as in (1), then tally up the results. To report the base rate, provide: the keywords used for the search, the completed CSV/Excel file (if you are running the script yourself), and the resulting base rate.
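
The tallying step common to both methods can be sketched in a few lines of Python. This is only an illustration, not the BWN script: the column name `Contact PI / Project Leader`, the "LAST, FIRST" name format, and the tiny `NAME_TO_SEX` table (a stand-in for a real probabilistic name-to-sex lookup) are all assumptions, so check them against your own export before relying on it.

```python
import csv
import io
from collections import Counter

# Assumed column name; check the header row of your actual export.
PI_COLUMN = "Contact PI / Project Leader"

# Hypothetical stand-in for a probabilistic name-to-sex lookup service.
NAME_TO_SEX = {"maria": "F", "john": "M", "susan": "F"}


def first_name(full_name):
    """Return the lowercased first name, assuming 'LAST, FIRST MIDDLE' format."""
    parts = full_name.split(",", 1)
    if len(parts) < 2 or not parts[1].strip():
        return ""
    return parts[1].split()[0].lower()


def tally(rows, column=PI_COLUMN):
    """Count each unique person once, grouped as 'F', 'M', or 'unknown'."""
    unique_names = {
        (row.get(column) or "").strip().upper()
        for row in rows
        if (row.get(column) or "").strip()
    }
    return Counter(
        NAME_TO_SEX.get(first_name(name), "unknown") for name in unique_names
    )


def base_rate(counts):
    """Share of women among people whose sex is known; None if nobody is known."""
    known = counts["F"] + counts["M"]
    return counts["F"] / known if known else None


# Made-up sample standing in for an exported file.
sample = """Project Title,Contact PI / Project Leader
A,"GARCIA, MARIA L"
B,"SMITH, JOHN"
C,"GARCIA, MARIA L"
D,"LEE, SUSAN"
E,"CHEN, ALEX"
"""

rows = list(csv.DictReader(io.StringIO(sample)))
counts = tally(rows)
print(dict(counts))
print(base_rate(counts))
```

In this sketch the duplicate entry is counted once, and the name with no match lands in the "unknown" group. The rate it prints is provisional: names in the "unknown" group still need to be resolved by an online search, as in (1), before the final base rate is reported.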