Harvard Undergrad Publishes Anonymized Student Data, Alleges Datamatch Security Flaw

A website created on Sunday by Harvard undergraduate Sungjoo Yoon ’27 exposed user data of Harvard College freshmen who had registered for Datamatch, a student-run online matchmaking service.

Yoon’s website, titled “the data privacy project,” published a list of freshmen’s Rice Purity Test scores, which users could optionally enter into their Datamatch profiles. On Yoon’s website, each score was listed alongside a set of student initials.

Created to allow undergraduates at Rice University to “track the maturation of their experiences throughout college,” the Rice Purity Test asks 100 questions about sexual experiences, drug use, and other illicit activity. Students check off whether they have engaged in each activity, and the quiz produces a score out of 100 that drops by one point for each item checked off.

Harvard undergraduates took to Sidechat — a social media app that allows users to publish posts anonymously — to react to the website’s launch, with some voicing concerns about data privacy while others found it funny.

Yoon, who created the project under the pseudonym “bernie marx,” wrote on the website that he published the data as part of a “case study” meant to warn fellow students to “not keep putting your info into random apps.”

Datamatch, which was founded by Harvard students and serves over 30 colleges, asks participants to fill out a profile on the website before an algorithm produces 10 potential matches on Valentine’s Day. Students can then receive funding for a free meal with their top matches.

In an interview with The Crimson Sunday evening, Yoon said that none of the information he published was identifiable to specific students, that it represented only “1/10 of 1%” of the data available to him, and that it would be deleted after one week.

Yoon said he took security measures to ensure student privacy on his website and processed the data in a way that was not visible to him.

He wrote in a description of his website that the list was encrypted with “a secure key.” If a student’s initials were rare or unique within the class, Yoon said he redacted their last initial and included only the first letter of their first name.

Yoon said he learned Datamatch users’ photos were accessible through a link anyone could type into their browser, which led him — as well as some others, including his brother — to look further into Datamatch’s data security.

According to Yoon, Datamatch does not encrypt most user data. He also said that when a user searches for or matches with another user, the other user’s data is sent to their device.

“We found literally all the private/personal user input data they store — from rice purity score to gender identity to location on campus. for all of the hundreds of thousands of people across all the dozens of schools that participated in datamatch,” Yoon wrote.

Yoon wrote on the website that “anyone with 10 seconds can thus pull this sensitive/vulnerable user data from their personal device.”

Yoon also claimed that Datamatch’s algorithm “discriminates against ethnic names” because it can’t accommodate diacritics, such as accents over letters.

In an emailed statement to The Crimson Sunday evening, Datamatch co-president Nadine Han ’25 wrote that Datamatch was “investigating this security issue and are taking measures to lock access to the attributes mentioned by the report.”

“We can guarantee that all profiles have been locked since approximately 9:30pm so that users can only view their information and no one else can,” Han wrote. “As of right this moment, all rice purity score information is deleted.”

In a section about data privacy on their website, Datamatch assures users that their data will be viewed and handled sparingly.

“The Datamatch team personally touches your information only as much as is necessary to develop the Algorithm™ and resolve user issues,” the website reads. “We may collect some anonymous stats like usage statistics, but your name and contact info will be completely separate from such reports.”

In a statement addressed to “harvard peers” on his website, Yoon warned students about the danger of big data and urged them to take data privacy seriously.

“We live in a society where dubiously-ethical governments and less-than-ethical corporations are looking to capitalize on this useful tool,” Yoon wrote, listing examples of the U.S. and Chinese governments buying data from brokers.

Yoon went on to warn students of the “fourth industrial revolution,” writing that “aggregate computing capacity is growing at an exponential rate.”

In his statement, Yoon also expressed his surprise that students had willingly given their data to Datamatch.

“It shocks me how many of u were willing to input sensitive data into things like claim and datamatch, even right here on campus,” Yoon wrote.

Yoon said he chose to release Rice Purity Scores to draw attention to the issue without causing users harm.

“I just picked it because I thought it would gain the most attention without being damaging to other people’s futures,” Yoon said.

“If someone else had thought about this before I did, they could have done really terrible things with this data,” he said.

In the future, Yoon said he hopes to see Datamatch devote more resources to data security.

“I think they need to fix this,” Yoon said. “They absolutely need to fix this.”

—Staff writer Jo B. Lemann can be reached at jo.lemann@thecrimson.com. Follow her on X @Jo_Lemann.
