This page contains a database which was compiled with permission from public genealogical profiles uploaded to, a MyHeritage company.

Thanks to the commitment of MyHeritage to support scientific research, we were able to release redacted data that contains the basic family tree structures and demographic information. The data does not contain names, addresses, or other explicit identifiers of users.

The data is provided as a MySQL dump file that contains several tables. We also provide a Python API to access the data.

Terms of Use

To download the data, you must agree to and accept the Terms of Use below.

    FamiLinx is provided to the scientific community BEFORE publication of the project. You are allowed to use the data, but only in adherence to the Fort Lauderdale Principles. That means that you are not allowed to submit any publication based on this data prior to the acceptance of the project's paper, which will be announced on this website.
    FamiLinx is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
  • 2. PRIVACY:
    You agree not to use the dataset, either alone or in concert with any other information, in any manner which may expose the identity of individuals. You agree not to use FamiLinx to promote hate, discrimination, or violence towards groups or individuals based on race, ethnicity, religion, gender, age, or family heritage or to use FamiLinx for any illegal activity. You further agree that any future work based on the FamiLinx database will be posted under the same privacy restriction above.
  • 3. GENERAL:
    You agree to report any violation of the Terms of Use to the authors of the study. FamiLinx and take no responsibility or liability for the accuracy of any content in the database.

Get the Data

We collect your contact information only to document the distribution of the data. We will not use this information to contact you for updates or offers or give it to any third party.



The FamiLinx database

These instructions have been tested on Ubuntu 10.04 with MySQL version 5.1.70.

# extract the archive, then decompress the result (this should create a dump.sql file)
>> tar -xvzf familinx.tar.gz
>> gunzip dump.sql.gz

# log in to the mysql shell (you will be prompted for your password)
>> mysql -u your_username -p

# create an empty database and load the content of the dump file into it
mysql> CREATE DATABASE familinx;
mysql> USE familinx;
mysql> source dump.sql

# create the user below and grant it access to the database
# It is important to use this username to make sure that the Python interface works properly
mysql> CREATE USER 'familinx'@'localhost' IDENTIFIED BY 'familinx';
mysql> GRANT ALL PRIVILEGES ON familinx.* TO 'familinx'@'localhost';

# check integrity: type the following command and compare the results with the output here.
mysql> SELECT TABLE_NAME, table_rows, data_length, index_length, round(((data_length + index_length) / 1024 / 1024),2) "Size in MB" FROM information_schema.TABLES WHERE table_schema = "familinx";

# You should expect to see this table:
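+--------------+------------+-------------+--------------+------------+
| TABLE_NAME   | table_rows | data_length | index_length | Size in MB |
+--------------+------------+-------------+--------------+------------+
| age          |    1039321 |     9353889 |     12528640 |      20.87 |
| founders     |   12080102 |   253682142 |    296000512 |     524.22 |
| gender       |   29780180 |   268021620 |    305615872 |     547.06 |
| location     |    5404864 |   157359560 |     64843776 |     211.91 |
| relationship |   51811932 |   673555116 |   1705807872 |    2269.14 |
| years        |    4351044 |    56563572 |     44653568 |      96.53 |
+--------------+------------+-------------+--------------+------------+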
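
The comparison against the expected table can also be done programmatically. The following is a minimal sketch, not part of the FamiLinx Python interface: the names `EXPECTED`, `size_in_mb`, `find_mismatches`, and `fetch_table_stats` are hypothetical, and the sketch assumes the `familinx` user created above and a MySQL server on localhost. Note that for InnoDB tables MySQL reports `table_rows` as an estimate, so small differences in that column do not necessarily indicate a damaged import.

```python
# Expected values copied from the table above:
# table name -> (table_rows, data_length, index_length, size in MB)
EXPECTED = {
    "age":          (1039321,  9353889,   12528640,   20.87),
    "founders":     (12080102, 253682142, 296000512,  524.22),
    "gender":       (29780180, 268021620, 305615872,  547.06),
    "location":     (5404864,  157359560, 64843776,   211.91),
    "relationship": (51811932, 673555116, 1705807872, 2269.14),
    "years":        (4351044,  56563572,  44653568,   96.53),
}


def size_in_mb(data_length, index_length):
    """Mirror the SQL expression round((data_length + index_length) / 1024 / 1024, 2)."""
    return round((data_length + index_length) / 1024.0 / 1024.0, 2)


def find_mismatches(rows, expected=EXPECTED):
    """Compare (table_name, table_rows, data_length, index_length) rows
    with the expected values. Returns a list of messages; an empty list
    means every table matched."""
    problems = []
    seen = set()
    for name, table_rows, data_length, index_length in rows:
        seen.add(name)
        if name not in expected:
            problems.append("unexpected table: %s" % name)
            continue
        got = (int(table_rows), int(data_length), int(index_length))
        want = expected[name][:3]
        if got != want:
            problems.append("%s: got %r, expected %r" % (name, got, want))
    for name in sorted(set(expected) - seen):
        problems.append("missing table: %s" % name)
    return problems


def fetch_table_stats(user="familinx", passwd="familinx", db="familinx"):
    """Query information_schema for the statistics (needs a running server)."""
    import MySQLdb  # imported lazily so the helpers above work without it
    conn = MySQLdb.connect(host="localhost", user=user, passwd=passwd, db=db)
    try:
        cur = conn.cursor()
        cur.execute(
            "SELECT TABLE_NAME, table_rows, data_length, index_length "
            "FROM information_schema.TABLES WHERE table_schema = %s",
            (db,),
        )
        return cur.fetchall()
    finally:
        conn.close()
```

With the database loaded, `find_mismatches(fetch_table_stats())` should return an empty list when the import matches the table above.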

The Python interface

Make sure that the following dependencies are installed:

  • Python 2.6.5 or above
  • MySQLdb 1.2.2 or above
  • NumPy 1.6.2 or above

    # Compile Idcoefs
    >> cd ./pythonUI/idcoef/Idcoefs2.1.1
    >> make
    >> cd -

    # Run a test script to check the Python interface
    # Here, we instruct the interface to calculate the IBD between individual 1 and individual 2 in FamiLinx
    >> cd pythonUI
    >> python -i 1,2

    # A successful run will produce the following output (if it fails, the program will report a detailed error message):
    trying to run IBD for the pair: 1 ,  2
    pedigree file: ./tmp/test.1.2.ped
    sample file: ./tmp/
    output file: ./tmp/test.1.2.output
    ramsize = 2000
    npeop = 4
    Successful completion.
    successfully calculated the kinship scores for this pair:
    kinship =  0.5
    the 9 identity coefficients are:  [ 0. 0. 0. 0. 0. 0. 0.25 0.5 0.25]
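
To help interpret the last two lines of that output, here is a small sketch that derives a kinship coefficient from nine condensed identity coefficients. It assumes the values are listed in the standard Jacquard order (Δ1 through Δ9), which this page does not state explicitly, and the function name `kinship_from_identity_coefficients` is hypothetical rather than part of the FamiLinx interface. Under that assumption, the standard formula φ = Δ1 + ½(Δ3 + Δ5 + Δ7) + ¼Δ8 gives 0.25 for the coefficients shown, and twice that value equals the 0.5 reported as "kinship" above. This would be consistent with the script reporting 2φ, but that reading is an inference and is not confirmed here.

```python
def kinship_from_identity_coefficients(deltas, tol=1e-6):
    """Kinship coefficient from Jacquard's nine condensed identity
    coefficients, assumed to be ordered Delta1..Delta9."""
    if len(deltas) != 9:
        raise ValueError("expected 9 coefficients, got %d" % len(deltas))
    if abs(sum(deltas) - 1.0) > tol:
        raise ValueError("identity coefficients should sum to 1")
    d1, d2, d3, d4, d5, d6, d7, d8, d9 = deltas
    # phi = Delta1 + (Delta3 + Delta5 + Delta7) / 2 + Delta8 / 4
    return d1 + 0.5 * (d3 + d5 + d7) + 0.25 * d8


# Coefficients copied from the test output above
coeffs = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25, 0.5, 0.25]
phi = kinship_from_identity_coefficients(coeffs)
print(phi)      # 0.25
print(2 * phi)  # 0.5
```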