Hi guys,
I'm working on a project to build a unified, aggregated football database by combining data from the FIFA series, Transfermarkt, PES, and Football Manager. The ultimate goal is to create a next-gen mod, similar in spirit to CEP (Career Expansion Patch), by integrating rich external data such as real-life transfers, attributes, and more.
Once the base database is solid, it could serve as a universal football modding resource for career mode editors, editors of other football games, or even custom games.
What I've done so far:
1. Extracted Player Data from FIFA Series (FIFA 2005 onwards)
- Downloaded @Skoczek’s FIFA DB collection.
- Used DB Master 08 and DB Master 15 to extract all players from each version.
- Collected: firstname, surname, commonname, jerseyname, birthdate, first_appearance, and last_appearance.
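A minimal sketch of how the per-version extracts could be folded into one record per player with first_appearance and last_appearance filled in. The `PlayerRecord` class, the `merge_editions` helper, and the choice of (firstname, surname, birthdate) as the merge key are assumptions for illustration, not what my actual extraction does; a stable player ID would be a better key where one is available.

```python
from dataclasses import dataclass


@dataclass
class PlayerRecord:
    firstname: str
    surname: str
    commonname: str
    jerseyname: str
    birthdate: str          # ISO date string, e.g. "1980-01-01"
    first_appearance: int   # earliest edition year the player was seen in
    last_appearance: int    # latest edition year the player was seen in


def merge_editions(rows):
    """Merge (edition_year, fields) pairs into one PlayerRecord per player.

    `fields` is a dict with firstname, surname, commonname, jerseyname and
    birthdate. The merge key below is an assumption made for this sketch.
    """
    merged = {}
    for edition, fields in rows:
        key = (fields["firstname"], fields["surname"], fields["birthdate"])
        record = merged.get(key)
        if record is None:
            merged[key] = PlayerRecord(
                **fields, first_appearance=edition, last_appearance=edition
            )
        else:
            record.first_appearance = min(record.first_appearance, edition)
            record.last_appearance = max(record.last_appearance, edition)
    return list(merged.values())
```

Note that a key containing the birthdate will split a player into two records whenever one edition stores a shifted date, so those rows need correcting before the merge.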
2. Matched Players to Transfermarkt
- Built a script to search each player on Transfermarkt using a "firstname + surname" query.
- Verified each match by comparing birthdates.
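The matching step can be sketched roughly as below. This is not the actual script: `build_query` and `pick_match` are hypothetical names, and the candidate dicts stand in for whatever a search-result parser would return. No network access is involved here.

```python
from datetime import date


def build_query(firstname, surname):
    """Build the "firstname + surname" search string, skipping empty parts."""
    return " ".join(part.strip() for part in (firstname, surname) if part.strip())


def pick_match(player_birthdate, candidates):
    """Return the only candidate with the same birthdate, or None.

    `candidates` is a list of dicts with at least a "birthdate" key holding
    a datetime.date. Zero hits or several hits are both treated as
    "no confident match" so they can be reviewed by hand.
    """
    hits = [c for c in candidates if c["birthdate"] == player_birthdate]
    return hits[0] if len(hits) == 1 else None
```

Returning None on ambiguous results keeps false positives out of the merged database, at the cost of a manual review queue.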
Issues Encountered:
1. FIFA 07 UCL Birthdate Bug:
Birthdates in this database are off by 2 days, possibly because of a time zone or data bug.
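One possible workaround is to allow a small tolerance when comparing birthdates for rows that come from the affected database. A minimal sketch, where `birthdates_match` and its `tolerance_days` parameter are names made up for this example:

```python
from datetime import date


def birthdates_match(fifa_date, tm_date, tolerance_days=0):
    """True if the two dates differ by at most `tolerance_days` days."""
    return abs((fifa_date - tm_date).days) <= tolerance_days


# Exact comparison for most editions, 2-day tolerance for affected rows.
exact = birthdates_match(date(1980, 3, 21), date(1980, 3, 23))
relaxed = birthdates_match(date(1980, 3, 21), date(1980, 3, 23), tolerance_days=2)
```

A tolerance makes accidental matches between players with the same name slightly more likely, so it is probably best limited to the affected edition.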
2. Special Characters & Hyphenation:
FIFA often omits hyphens in Arabic and Korean names (e.g., Min Jae Kim in FIFA vs Min-Jae Kim on Transfermarkt).
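A common way to handle this is to compare normalized forms of both names instead of the raw strings. The sketch below uses only Python's standard `unicodedata` module; `normalize_name` is a hypothetical helper, and the exact set of characters to fold is a judgment call.

```python
import unicodedata


def normalize_name(name):
    """Fold a name into a comparison key: no accents, no hyphens, lowercase."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Treat ASCII and Unicode hyphens as spaces.
    for hyphen in ("-", "\u2010"):
        stripped = stripped.replace(hyphen, " ")
    # Collapse repeated whitespace and ignore case.
    return " ".join(stripped.split()).casefold()


same = normalize_name("Min Jae Kim") == normalize_name("Min-Jae Kim")
```

The normalized key should only be used for comparison; the original spelling is worth keeping in the database.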
3. Name Variants (Hypocorisms):
Cases like Andy vs Andrew, Don vs Donald. A reference list of nickname mappings would help improve match accuracy.
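To show how such a reference list could be used, here is a small sketch. The `NICKNAMES` table contains only the two pairs mentioned above and is purely illustrative; `firstname_variants` is a hypothetical helper that expands a first name in both directions.

```python
# Illustrative mapping only: short form -> full form.
NICKNAMES = {
    "andy": "andrew",
    "don": "donald",
}


def firstname_variants(firstname):
    """Return the name plus any known short or full forms, all casefolded."""
    key = firstname.casefold()
    variants = {key}
    # Short form given: add the full form.
    if key in NICKNAMES:
        variants.add(NICKNAMES[key])
    # Full form given: add every short form that maps to it.
    variants.update(short for short, full in NICKNAMES.items() if full == key)
    return variants
```

Each variant can then be tried as a separate search query, with the birthdate check still deciding whether a result is accepted.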
4. Common Names vs Full Names:
Searching Roberto Carlos Da Silva Junior won’t return Roberto Carlos. Need to include commonname and jerseyname fields in the search logic.
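A rough sketch of that search logic: try several queries per player, in order of preference, and stop at the first one that produces a birthdate-verified match. `search_queries` is a hypothetical helper, and the way the example name is split across fields is a guess for illustration.

```python
def search_queries(player):
    """Return de-duplicated search strings for one player, best first.

    `player` is a dict with optional firstname, surname, commonname and
    jerseyname keys. Order: commonname, full name, jerseyname.
    """
    candidates = [
        player.get("commonname", ""),
        f'{player.get("firstname", "")} {player.get("surname", "")}',
        player.get("jerseyname", ""),
    ]
    seen, queries = set(), []
    for query in candidates:
        query = " ".join(query.split())  # trim and collapse whitespace
        if query and query.casefold() not in seen:
            seen.add(query.casefold())
            queries.append(query)
    return queries


# Field values below are illustrative, not taken from the real database.
queries = search_queries({
    "firstname": "Roberto Carlos",
    "surname": "Da Silva Junior",
    "commonname": "Roberto Carlos",
    "jerseyname": "R. Carlos",
})
```

Trying the commonname first keeps the number of requests down for well-known players, since the first query is the most likely to hit.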
5. Women Players:
Transfermarkt doesn’t list female players, so they’ll need to be filtered out.
6. Squad File Issues (RDBM):
When opening Squad files using RDBM, many player names appear blank or missing. If anyone has a working method to extract those cleanly, I’d love to try it.
CURRENT DATASET (WIP)
PES
- Found a good player dataset from EvoWeb (last updated last year), but it lacks birthdates, which complicates exact matches.
- Open to any ideas or alternate PES datasets.
- FMDB is promising, but I’m cautious about making 100,000+ web requests because I don’t want to trigger anti-bot or DDoS protections.
- If there’s a better or more efficient way to scrape or source data (e.g., open dumps, APIs), please let me know.
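If scraping turns out to be the only option, one way to keep the load low is to space requests out and never fetch the same URL twice. The sketch below is an assumption-heavy illustration: `PoliteFetcher` is a made-up class, the 2-second interval is arbitrary, and the real HTTP call is passed in as `fetch` so nothing here depends on a specific site or library. Checking the site's robots.txt and terms of use first is still necessary.

```python
import time


class PoliteFetcher:
    """Wrap a fetch function with a minimum delay and an in-memory cache."""

    def __init__(self, fetch, min_interval=2.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch              # callable: url -> response body
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_request = None        # time of the last real request
        self._cache = {}

    def get(self, url):
        # Cached pages cost no request and no waiting.
        if url in self._cache:
            return self._cache[url]
        # Wait until at least min_interval has passed since the last request.
        if self._last_request is not None:
            wait = self._min_interval - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)
        self._last_request = self._clock()
        body = self._fetch(url)
        self._cache[url] = body
        return body
```

At one request every 2 seconds, 100,000 requests take about 200,000 seconds, or roughly 55 hours, so persisting the cache to disk between runs would be worth adding.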
Any suggestions or help would be greatly appreciated.