All posts by Naomi Rachel Outlaw

A lover of the eclectic, following a journalistic pursuit at San Francisco State University. Always exploring what makes the world go round.

Data Mining: The New Gold Rush

Photo via Arbeck of Wikimedia Commons


Data, and the insight it provides, is power. Just look at the rash of privacy breaches that struck the NSA, Target, iCloud, Samsung and the United States Postal Service to see how exposed the information most organizations consider private can be. Data is growing exponentially, and now more than ever online users need to understand what happens to their data in order to avoid, as Dropbox CEO Drew Houston infamously put it, a “trade off between privacy and convenience.”

The digital universe is doubling in size every two years. According to a 2014 study by the International Data Corporation, the amount of data will grow from 4.4 trillion gigabytes in 2013 to 44 trillion gigabytes by 2020. In more human terms, the average household today creates enough data to fill 65 32GB iPhones per year. In 2020 this will increase to 318 iPhones, according to EMC, a corporation that offers data storage and analysis.
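As a rough check on these figures, the short Python sketch below applies the "doubling every two years" rule to the 2013 total and converts the iPhone counts into gigabytes. Only the numbers quoted above come from the article. The function names are illustrative, and the steady-doubling formula is an assumption about how the growth compounds, not IDC's or EMC's own method.

```python
# Figures quoted in the article (IDC / EMC); everything derived from them
# below is a back-of-the-envelope estimate, not an official projection.
DATA_2013_TRILLION_GB = 4.4   # size of the digital universe in 2013
DOUBLING_PERIOD_YEARS = 2     # "doubling in size every two years"
IPHONE_CAPACITY_GB = 32       # the 32GB iPhone used as a yardstick


def project(start, years, doubling_period=DOUBLING_PERIOD_YEARS):
    """Size after `years` if it doubles once every `doubling_period` years."""
    return start * 2 ** (years / doubling_period)


def household_gb(iphones, capacity_gb=IPHONE_CAPACITY_GB):
    """Convert a number of filled iPhones into gigabytes."""
    return iphones * capacity_gb


# Seven years of steady doubling, from 2013 to 2020.
projected_2020 = project(DATA_2013_TRILLION_GB, 2020 - 2013)
print(round(projected_2020, 1))   # 49.8 trillion GB, near the quoted 44

# Household data per year, now and in 2020.
print(household_gb(65))    # 2080 GB
print(household_gb(318))   # 10176 GB
```

Steady doubling gives roughly 50 trillion gigabytes by 2020, which is in the same range as the 44 trillion the study projects, so the two claims are broadly consistent.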

“The amount of data created in the past two years is more than the amount of data we’ve ever had… So there is a huge amount of data and a need for a way to sort through them,” says Hui Yang, an assistant professor in the Computer Science department at SF State.

The bulk of this data is metadata, or information generated when you use technology. It is everyday data collected from consumers’ activities and can contain information such as locations, IP addresses, web searches and other browser histories. By law, most metadata can be stored indefinitely. Through data mining, a field of computer science that analyzes the patterns and connections among data, it can be used to classify anything from relationships between genes and diseases to which internet users are more likely to buy a company’s product. Using this information for commercial purposes is where data mining gets a bad rap.

Although “data mining” is now a pop culture term, it has for decades played a behind-the-scenes role in the growth and comprehension of the digital universe. It helps find patterns among vast amounts of data that human eyes cannot discover. And while data mining analyzes everything from medical data to business data to human rights data, it is also one of the tools used by data brokers, companies that collect, maintain, and sell data on millions of consumers, generally without the consumers’ permission or knowledge.

The stigma that now surrounds large-scale data collection of any kind is a more recent development, one that is more visible than the data being acquired, and it can largely be attributed to the business built around selling people’s metadata.

According to last year’s report from the International Data Corporation, a market research and analysis firm, “In 2013, two-thirds of the digital universe bits were created or captured by consumers and workers, yet enterprises had liability or responsibility for 85% of the digital universe.”

Data brokers are among these enterprises.

Much of the personal information analyzed through data mining and collected by data brokers is demographic and transaction information about the user, the device, and the activities occurring in between. But credit card information, census data, and other public records are also included.

“This information makes clear that consumers going about their daily activities – from making purchases online and at brick-and-mortar stores, to using social media, to answering surveys to obtain coupons or prizes, to filing for a professional license – should expect that they are generating data that may well end up in the hands of data brokers… without their permission to construct detailed profiles on them reflecting judgments about their characteristics and predicted behaviors,” reads a 2014 Senate committee report.

Generally, metadata analysis aims only to deduce codes and statistics like IP addresses, but when a user is tracked across multiple platforms, the paper trail can become pretty direct.

Even then, the Senate report goes on to say, “Some privacy and information experts have expressed concerns that re-identification techniques may be used with such data, and questioned whether data that identifies specific computers and devices can truly be considered anonymous.”
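To make the re-identification concern concrete, here is a minimal Python sketch of how two data sets that each look harmless can be linked through a shared IP address. Every record, site name and field name below is invented for illustration. It is one simple linking technique, not a description of how any particular data broker works.

```python
# All records below are hypothetical; no real data set is implied.
# Data set 1: an "anonymous" browsing log keyed only by IP address.
browsing_log = [
    {"ip": "203.0.113.7", "site": "health-forum.example", "query": "diabetes symptoms"},
    {"ip": "203.0.113.7", "site": "shop.example", "query": "running shoes"},
    {"ip": "198.51.100.22", "site": "news.example", "query": "local election"},
]

# Data set 2: a coupon sign-up list that records a name next to the same IP.
coupon_signups = [
    {"ip": "203.0.113.7", "name": "Jane Doe", "zip": "94132"},
]


def link_by_ip(log, signups):
    """Attach a name to every log entry whose IP also appears in the sign-ups."""
    names = {row["ip"]: row["name"] for row in signups}
    return [
        {"name": names[entry["ip"]], **entry}
        for entry in log
        if entry["ip"] in names
    ]


linked = link_by_ip(browsing_log, coupon_signups)
for row in linked:
    print(row["name"], "->", row["site"], "/", row["query"])
```

Two of the three log entries end up with a name attached, even though the browsing log itself never contained one.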

Anonymous to whom? When the Senate asked data brokers who buys the information they gather, companies across all industries were named.

“12 of the top 15 credit card issuers; seven of the top 10 retail banks; eight of the top 10 telecom/media companies… three of the top 10 pharmaceutical manufacturers; five of the top 10 life/health insurance providers; nine of the top 10 property and casualty insurers,” reads the 2014 Senate report.

Some of the best-known offenders: Yahoo, Twitter, YouTube, Google or DoubleClick, and AOL. But what’s surprising is the type of companies that buy and sell consumers’ metadata.

Just recently, the Associated Press reported that the Affordable Care Act website, where Americans can sign up to receive health care, was sending users’ information to a number of third-party companies.

So it seems that no matter how personal, some information is not private information, at least not to these companies. The amount and type of information gathered and analyzed is ultimately unknown to most users, which makes opting out of having your data collected almost impossible.

But fear not: online user security is becoming more of an immediate concern. In February, President Obama announced new rules requiring intelligence analysts, such as those at the NSA, to delete private information they may accidentally collect about Americans. The President also spoke at the White House Summit on Cybersecurity and Consumer Protection at Stanford University on February 13, discussing legislation intended to strengthen cybersecurity, an issue that he likened to “the wild wild west,” according to the New York Times.

Since 2009 and continuing into 2014, the United States Federal Trade Commission has recommended that Congress develop legislation that allows consumers to view the information data brokers hold about them. One of the few online consumer rights laws is California’s “Shine the Light” law, which requires companies doing business with Californians to allow customers to opt out of information sharing, or to disclose how personal information will be shared.

Hence, obscure and needlessly long privacy terms and agreements are more relevant than ever.

“Data mining is relatively new and it’s affecting everyone but it does not have many laws. It’s like a free market,” Yang says. “People definitely feel like they are being watched, but if you look at privacy and then what people post, (privacy) needs a lot of work.”

To some extent, the fear about data mining can be attributed to a general lack of knowledge and regulation, fueled by headlines about the NSA. On the other hand, users are actively creating and allowing the collection and analysis of their information.

Last quarter, Facebook reported an average of 890 million daily active users. A 2012 survey by the Pew Research Center shows that, “More than half of social networking site users (58 percent) say their main profile is set to private.” That still leaves the data of 42 percent of social media users unprotected.

Data is constantly being created, but in the current age it has also come to mean more, not only to users but also to the companies that consume it. Data has become a panopticon, a platform on which we create our own images and through which others see our constant updates.

Ultimately, it is up to the user to manage what information they put online. Data mining and other computer sciences can be used by consumers as both an advantage and a disadvantage.

When data about locations and transactions is pooled, business with the companies that analyze this data can be much more personalized. Take Google Now as an example. If you input information such as the location of your home or work, your favorite sports teams, your most frequently made food or grocery orders, or even your airplane tickets, Google Now will provide “relative suggestions” on routes to work, restaurants and events in your area, provide updates on your favorite team, and remind you when your flight is and when you should leave to arrive on time.

In this setting, what can be considered private information can be sacrificed for convenient personalization.

On the other hand, organizations like Stop Data Mining provide “opt out lists” with links to the opt-out pages of companies that collect data. Or, for a simpler solution, almost all major browsers contain a “Do Not Track” preference. There are other options as well: removing or manually managing the “cookies” that collect metadata, using alternative search engines like DuckDuckGo, which doesn’t collect or share personal information, and of course adjusting privacy settings on social media.

As the amount of data continues to grow exponentially, there will be a need for more ways to organize and sort it. What’s data mining’s future?

“More of it. More people from more backgrounds becoming data scientists. More tools for data scientists. More schools teaching data science. More products built on data understanding. Oh, and robots,” says Todd Holloway, a data analyst at Trulia and an organizer of the San Francisco Data Mining Group, which teaches how to effectively use data mining to, say, win at fantasy football.

Whichever way you bend, know the power of the data you put out and the transparency that it carries.


Downtown Salon Gives Men Free Haircuts

Lana Bowen, owner of Salana Hair Studio in Lower Nob Hill, styles Pedram Afshar’s new mohawk at the free haircuts event on Feb. 5th. Photo by Zhenya Sokolova

Teddy Hall, 18, didn’t know what to expect when he decided to RSVP for a free haircut advertised on, but as his shoulder length hair slowly began to fall toward the floor, he knew there was no going back.

This past Thursday was the first free haircut event of the new year at Salana Hair Studio. It was held in conjunction with the Lower Polk/Tenderloin Art Walk, which provides exposure to new hairstyles, artwork and diverse crowds.

On the first Thursday of the next three months, Salana will be offering free barber’s choice hairstyles to men. This month’s inspiration: mohawks.

“I do regular haircuts all day. Tonight we’re going for funky,” says Salana owner Lana Bowen.

Hall watches Bowen style his new mohawk. Photo by Zhenya Sokolova / Xpress Magazine

“I figured if the line was too long, I would go home,” said one of the men in line as he laughed nervously. After seeing the first few mohawks, he decided he would wait until next month to get his free and hopefully more conservative haircut.

Although “barber’s choice” may sound like a disaster waiting to happen, the music, beers, art and laughter maintained a jovial atmosphere well into the night.

On top of scoring a free haircut, patrons have the opportunity to admire and purchase artwork featured in the salon. This month, Casey Castille’s intricate posters, which showcase the synergy of San Francisco’s creative community, adorn the walls of Salana.

Cafes, bookstores and various galleries in the downtown area also showcase local artists during the monthly Lower Polk/Tenderloin Art Walk. Artists, art vultures, performers, business professionals, techies, students and those who just happen to saunter by the Art Walk locations all mingle with no particular expectation for the evening. Stories are shared, art is admired, and at the center of it all is the potpourri of people that define San Francisco.

This is the sixth year for the Art Walk. According to the event’s community manager Christine Villanueva, up to 1,300 people attend the shows.

“During the drier seasons, we have alleyway events where we promote local artists and craftsmen, accompanied by a live band and food trucks,” Villanueva says.

Michael Hussey reacting to his brand new haircut. Photo by Zhenya Sokolova / Xpress Magazine

March 5, the first Thursday of the month, from 7 to 9 p.m. is the next date for free haircuts, and it’s also Salana’s one-year anniversary. So guys, clear your schedules so you can keep your haircuts fresh and some extra money in your bank accounts.

Bob Simon: No Borders Uncrossed

Bob Simon, acclaimed CBS “60 Minutes” correspondent and a sentinel of ambitious journalism, died Wednesday night in Manhattan when the taxi he was riding in slammed into metal lane dividers after rear-ending a stopped car.

Simon, 73, was transported to St. Luke’s Roosevelt Hospital in New York City but died of head and chest injuries. The driver of the taxi sustained broken limbs but is in stable condition, along with the driver of the other car. A law enforcement official told The Wall Street Journal that the ongoing investigation suggests speed may have been a factor in the crash and that no substance abuse is suspected. So far no one is in custody.

With a career in journalism spanning more than four decades, five wars, and sixty-seven countries, Simon constantly pushed the boundaries of war, crisis, and overseas reporting.

His reporting earned more than 40 major awards, including 27 Emmys, four Peabody Awards, and more recently the Special President’s Lifetime Achievement award from the Overseas Press Club, and his work and ambition chronicled historical events for thousands worldwide.

A Bronx native, Simon began his 47-year career at CBS in 1967 after graduating Phi Beta Kappa from Brandeis University in 1962 with a degree in history. There he covered campus unrest and inner city riots prior to being assigned overseas. From 1971 to 1977 he worked out of the London and Saigon bureaus before eventually moving back to the US, where in 1987 he was named CBS News’ Chief Middle Eastern correspondent. During this time he covered everything from violence in Northern Ireland to war zones in Portugal, Cyprus and Yugoslavia, American military actions in Grenada, and, notably, the end of the Vietnam War, when he boarded one of the last helicopters to leave, according to CBS.

Perhaps his best-known story is his coverage of the early days of the Persian Gulf War in 1991, when Simon, along with three CBS News colleagues, was imprisoned and tortured for 40 days, an experience he wrote about in his book, “Forty Days.” But even this experience didn’t deter his journalistic pursuits. He returned to Baghdad only two years later to cover the American bombing of Iraq and went on to cover major events like the Olympics and the Arab Spring.

Simon continuously sought to discover and cover events that left the world in limbo and wars that were remote from most people’s daily lives, and he brought worldwide issues into the homes of many through his 19 seasons as a “60 Minutes” correspondent.

“Bob was a reporter’s reporter. He was driven by a natural curiosity that took him all over the world covering every kind of story imaginable,” said “60 Minutes” Executive Producer Jeff Fager in a statement.

At the time of his death, Simon and his daughter Tanya, a “60 Minutes” producer, were working on a story about the Ebola virus and possible cures to be featured on Sunday’s “60 Minutes” broadcast.

Simon is survived by his wife, Françoise, their daughter, Tanya, and his grandson Jack.