Automated Philanthropy Scoring Framework

This project develops tools to semi-automate a specific style of philanthropic evaluation. The current system has two components, each processing a different data source:

The Form 990 code uses Excel VBA to score IRS Form 990s submitted by 501(c)(3) organizations. The scoring highlights entities that have endowments, award scholarships, and emphasize science education or research. This framework is at an early stage of development, and future versions are expected to be implemented in Python with relational database support and a web-based interface.

The Scholarship Directory information is scraped from the Labor Department's CareerOneStop website using Python and associated tools.

Form 990 Scorer

Getting Started

To get the system running:

  1. Download Code.xlsm and place it in your working directory (e.g., x/).
  2. Download the .txt files described below: nodenames.txt, stopwords.txt, punctuation.txt, and rule.txt.
  3. Download IRS Form 990 XML files from IRS Form 990 Series Downloads.
  4. Unzip those forms into x/testforms/, and create subdirectories:

Prerequisites

Installation Notes

nodenames.txt

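Each line defines one node to extract, as three semicolon-delimited fields: a data type, a number (apparently a maximum field length), and the path to the XML node. For example: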

Date;10;Return/ReturnHeader/TaxPeriodBeginDt  
Integer;4;Return/ReturnHeader/TaxYr  
AbsInt;15;Return/ReturnData/IRS990/CYInvestmentIncomeAmt  
String;600;Return/ReturnData/IRS990/ActivityOrMissionDesc  
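As an illustration of the line format, here is a minimal Python sketch that splits a nodenames.txt line into its three fields. The project itself does this in VBA; the names NodeSpec, parse_nodename_line, and load_nodenames are hypothetical, and reading the second field as a width is an assumption based on the sample lines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    kind: str   # first field, e.g. "Date", "Integer", "AbsInt", "String"
    width: int  # second field; assumed here to be a maximum field length
    path: str   # third field: slash-separated path to the XML node


def parse_nodename_line(line: str) -> NodeSpec:
    """Split one 'Type;Number;Path' line into a NodeSpec."""
    kind, width, path = line.strip().split(";", 2)
    return NodeSpec(kind, int(width), path)


def load_nodenames(text: str) -> list[NodeSpec]:
    """Parse every non-blank line of a nodenames.txt-style string."""
    return [parse_nodename_line(l) for l in text.splitlines() if l.strip()]


spec = parse_nodename_line("Integer;4;Return/ReturnHeader/TaxYr")
print(spec.kind, spec.width, spec.path)
```

Keeping the path as a plain string leaves the choice of XML lookup strategy to the parsing step.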

stopwords.txt and punctuation.txt

Used to clean and tokenize text fields. Feel free to modify either file.
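The cleaning step might look roughly like the following Python sketch. It assumes that stopwords.txt holds one word per line and punctuation.txt one character per line; the actual file layouts and the VBA logic may differ, and the function name tokenize is hypothetical.

```python
def tokenize(text, stopwords, punctuation):
    """Lowercase text, replace punctuation with spaces, and drop stopwords.

    stopwords: set of lowercase words to discard (assumed file content).
    punctuation: iterable of single characters to strip (assumed file content).
    """
    # Map every punctuation character to a space so words stay separated.
    table = str.maketrans({ch: " " for ch in punctuation if len(ch) == 1})
    words = text.lower().translate(table).split()
    return [w for w in words if w not in stopwords]


tokens = tokenize(
    "To provide scholarships, and fund research.",
    stopwords={"to", "and"},
    punctuation={",", "."},
)
print(tokens)
```

Replacing punctuation with spaces rather than deleting it avoids gluing together words that were separated only by a comma or slash.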

rule.txt

Defines scoring logic for each rule. Users can modify or add rules.

Parsed & Scored Worksheets

Rule Types

There are four rule types. Each uses a semicolon-delimited format:

1. Substring

Substring;RuleName;Nodename;Present;token1,token2,...

2. Trend

Trend;RuleName;Nodename1,Nodename2,...

3. Percentile

Percentile;RuleName;Nodename;Cutoff

4. Eval

Eval;RuleName;Nodename;NumOrTxt;Expression
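To make the four formats concrete, here is a small Python sketch that splits a rule line into named fields. It is not the project's VBA parser; the function name parse_rule and the dictionary keys are hypothetical choices for illustration.

```python
def parse_rule(line: str) -> dict:
    """Split one rule.txt line into a dict keyed by field name."""
    # An Eval expression could itself contain ';', so cap the split at 5 fields.
    fields = line.strip().split(";", 4)
    rule_type = fields[0]
    if rule_type == "Substring":
        _, name, node, present, tokens = fields
        return {"type": rule_type, "name": name, "node": node,
                "present": present, "tokens": tokens.split(",")}
    if rule_type == "Trend":
        _, name, nodes = fields
        return {"type": rule_type, "name": name, "nodes": nodes.split(",")}
    if rule_type == "Percentile":
        _, name, node, cutoff = fields
        return {"type": rule_type, "name": name, "node": node,
                "cutoff": float(cutoff)}
    if rule_type == "Eval":
        _, name, node, num_or_txt, expression = fields
        return {"type": rule_type, "name": name, "node": node,
                "num_or_txt": num_or_txt, "expression": expression}
    raise ValueError(f"Unknown rule type: {rule_type!r}")


print(parse_rule("Substring;Web;IRS990_WebsiteAddressTxt;T;academy,edu"))
```

Raising on an unknown rule type surfaces typos in rule.txt early instead of silently skipping a rule.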

Sample Rules from rule.txt:

Eval;Age;IRS990_FormationYr;Num;Year(Now()) - IRS990_FormationYr > 15  
Substring;Web;IRS990_WebsiteAddressTxt;T;academy,edu  
Percentile;EndYrBal;CYEndwmtFundGrp_EndYearBalanceAmt;0.50  
Trend;YrNet;IRS990_NetAssetsOrFundBalancesBOYAmt,IRS990_NetAssetsOrFundBalancesEOYAmt  
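The README does not spell out how each rule type is evaluated, so the following Python sketch shows only one plausible reading of the Substring, Trend, and Percentile samples above. All three semantics are assumptions (any-token match, strictly increasing values, rank at or above the cutoff fraction), and the function names are hypothetical. Eval rules are omitted because their expressions are written in VBA syntax.

```python
def substring_hit(text, tokens):
    """Assumed Substring semantics: True if any token occurs in text."""
    lowered = text.lower()
    return any(tok.lower() in lowered for tok in tokens)


def trend_increasing(values):
    """Assumed Trend semantics: each value strictly exceeds the previous one."""
    return all(b > a for a, b in zip(values, values[1:]))


def at_or_above_percentile(value, population, cutoff):
    """Assumed Percentile semantics: the share of the population that is
    less than or equal to value is at least cutoff (e.g. 0.50)."""
    at_or_below = sum(1 for v in population if v <= value)
    return at_or_below / len(population) >= cutoff


# Illustrative inputs only; these are not values from real filings.
print(substring_hit("www.example.edu", ["academy", "edu"]))
print(trend_increasing([100, 120]))
print(at_or_above_percentile(50, [10, 20, 50, 80], 0.50))
```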

Running the System

  1. Move Files
    In VBA module move990, run Move990Files

    Moves Form 990 files to /990/, skips Form 990EZ and others

  2. Parse XMLs
    In module Parse, run ParseXML990Files

    Extracts nodename data into Parsed990Data

  3. Clean Text
    In module Strip, run Master

    Cleans descriptions and web addresses; populates DescFiltered

  4. Score Data
    In module Score, run Score

    Evaluates rules and outputs to Scored990Data
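Step 2 above is implemented in VBA. As a rough idea of how the same extraction could look in the planned Python version, the sketch below looks up a node by a slash-separated path like those in nodenames.txt. It matches on local tag names so that any XML namespace on the filing is ignored; the helper names local_name and find_text, and the tiny inline document, are made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def find_text(root, path: str):
    """Return the text at a path such as 'Return/ReturnHeader/TaxYr',
    or None if any step of the path is missing."""
    parts = path.split("/")
    if local_name(root.tag) != parts[0]:
        return None
    node = root
    for part in parts[1:]:
        # Take the first child whose local name matches this path step.
        node = next((c for c in node if local_name(c.tag) == part), None)
        if node is None:
            return None
    return (node.text or "").strip()


# Minimal stand-in for a filing; a real Form 990 XML file is far larger.
sample = (
    '<Return xmlns="http://www.irs.gov/efile">'
    "<ReturnHeader><TaxYr>2022</TaxYr></ReturnHeader>"
    "</Return>"
)
print(find_text(ET.fromstring(sample), "Return/ReturnHeader/TaxYr"))
```

Returning None for a missing node lets the caller leave the corresponding cell blank rather than fail on filings that omit optional fields.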

Scraping Scholarship Directory

Prerequisites

Activate ChromeDriver.exe

If you don't have these Python extensions, then run:

Download scraper.py from the GitHub repository and place it in your working directory.

Output

The output is written to a CSV file with 8 columns and one row per scholarship. The columns are labeled:

  1. ID
  2. Award Name
  3. Organization
  4. Purpose
  5. Level of Study
  6. Award Type
  7. Award Amount
  8. Deadline

In the output uploaded to GitHub, the CSV file (scholarships.csv) has 10,000 rows, which was everything available from the Labor Department's CareerOneStop website on July 30, 2025.
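A short Python sketch for loading the scraper output and checking its columns follows. It assumes the file has a header row using exactly the eight labels listed above; the function name read_scholarships and the sample row are made up for illustration.

```python
import csv
import io

EXPECTED_COLUMNS = [
    "ID", "Award Name", "Organization", "Purpose",
    "Level of Study", "Award Type", "Award Amount", "Deadline",
]


def read_scholarships(handle):
    """Read scholarship rows from an open text file and verify the header."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"Unexpected columns: {reader.fieldnames}")
    return list(reader)


# Made-up row for demonstration; for the real file, use
# open("scholarships.csv", newline="", encoding="utf-8") instead.
demo = io.StringIO(
    "ID,Award Name,Organization,Purpose,Level of Study,Award Type,Award Amount,Deadline\n"
    "1,Example Award,Example Org,Example purpose,Undergraduate,Scholarship,1000,2025-12-31\n"
)
rows = read_scholarships(demo)
print(len(rows), rows[0]["Award Name"])
```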

Authors & Contributions

License

This project is licensed under the GNU General Public License v3.0.