Big Data Analytics
Big Data and Ethics
Jesse Eickholt, PhD
Ethics
“An ethical analysis [of … ] would be
well served by starting with the
concepts of ethics. However …
this would require a description of a
philosophical foundation from which
concepts of morality can be
developed.”
Timmermans et al., The Ethics of Cloud Computing.
A large financial institution wants a “big
data” approach to determining who would
be a “risky” customer (i.e., who should be
charged a higher interest rate).
A bright, up-and-coming employee decides
to scrape social media posts for all current
customers. This information is combined
with the company’s historical payment
data.
Example
- Facebook posts
- Tweets
- Comments posted on websites
Extract common words (e.g., positive and negative), common phrases used (e.g., my boss is a ….), movies and music tastes.
Payment history (e.g., on time, late, etc.)
Machine learning
Classify as: Risky / Safe
Use to make predictions on new, potential customers.
Maybe it works great! Identifies a “risky” customer 80% of the time.
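The pipeline above (scraped text plus payment history, fed to a classifier that outputs Risky / Safe) can be sketched in a few lines. This is only a minimal illustration, not the institution's actual method: the slide says "machine learning" without naming an algorithm, so a small naive Bayes word model is assumed here, and the training posts and labels are made up.

```python
import math
from collections import Counter

# Hypothetical toy data: (scraped post text, label derived from payment history).
# Every post and label here is invented for illustration.
TRAINING = [
    ("my boss is a jerk and i hate this job", "risky"),
    ("broke again this month hate bills", "risky"),
    ("love my new job great team", "safe"),
    ("great weekend with family love it", "safe"),
]

def tokenize(text):
    """Split a post into lowercase words."""
    return text.lower().split()

def train(examples):
    """Count words per label (the 'extract common words' step)."""
    word_counts = {}          # label -> Counter of words
    label_counts = Counter()  # label -> number of posts
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def classify(text, model):
    """Return the label with the highest naive Bayes log score."""
    word_counts, label_counts, vocab = model
    total_posts = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_posts)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                # Add-one smoothing so unseen (label, word) pairs are not zero.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAINING)
print(classify("i hate my boss", model))
print(classify("love this great team", model))
```

Note what the sketch makes visible: a new, potential customer who vents about a boss is labeled "risky" purely because of word overlap with other people's posts, before that person has any payment history at all.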
A health insurance company wants to have a better idea of the health of its
customers. Clients with additional “risk factors” will be charged more or
possibly denied coverage.
The company monitors social media posts, looking for mentions of restaurants
visited, and looks at purchases on Amazon.
Wait! How did they get that data?
A crude heuristic is developed to look
for certain combinations of fast food
restaurants and certain sizes of clothing
(e.g., many fast food restaurants and
large clothing sizes indicate a risk).
Again, what is wrong with this
approach?
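To make the question concrete, the crude heuristic can be written down as a rule. This is a sketch under assumptions: the slide gives no numbers, so the visit threshold, the set of "large" size labels, and the function name `is_flagged` are all hypothetical.

```python
# Hypothetical values; the source does not specify any thresholds or sizes.
FAST_FOOD_MENTIONS_THRESHOLD = 4
LARGE_SIZES = {"XL", "XXL", "XXXL"}

def is_flagged(fast_food_mentions, clothing_sizes_purchased):
    """Flag a client when many fast food mentions coincide with large clothing sizes."""
    many_fast_food = fast_food_mentions >= FAST_FOOD_MENTIONS_THRESHOLD
    large_clothing = any(size in LARGE_SIZES for size in clothing_sizes_purchased)
    return many_fast_food and large_clothing

# A client who posts about team lunches and once bought an XXL shirt as a gift.
gift_buyer = is_flagged(6, ["M", "XXL"])
# A client with one fast food mention who wears XXL.
occasional = is_flagged(1, ["XXL"])
print(gift_buyer, occasional)
```

The rule cannot tell who ate the food or who wears the clothing, so the gift buyer is flagged as a risk while nothing about that person's actual health was ever measured.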
The department store would also like
to know when a customer visits the
store but does not make a purchase.
The store offers free WiFi and
customers can connect via their phone
which contains a unique identifier
(MAC address).
What? No way! How? OK, but how do you connect a MAC address to a customer’s name?
MAC addresses are tied to a customer name
by two routes…
i) cross checking credit card/reward
purchases with the presence of a particular
MAC address [if they both coincide more
than 5 times could it be a coincidence?]
ii) customers can download and use an
in-store app which provides coupons and
discounts.
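Route (i) is simple counting, which is part of why it is worth discussing. The sketch below is an assumption-laden illustration rather than any store's real system: the 30-minute window, the record layout, and the function name `link_macs_to_customers` are hypothetical; only the "more than 5 times" rule comes from the slide.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)  # assumed: how close a sighting must be to a purchase
MIN_COINCIDENCES = 5            # from the slide: link when they coincide more than 5 times

def link_macs_to_customers(sightings, purchases):
    """Return (mac, customer) pairs that coincide more than MIN_COINCIDENCES times.

    sightings: list of (mac_address, datetime) from the WiFi logs
    purchases: list of (customer_id, datetime) from card/reward records
    """
    pair_counts = Counter()
    for customer, bought_at in purchases:
        # Every device seen on the WiFi around the time of this purchase.
        macs_present = {mac for mac, seen_at in sightings
                        if abs(seen_at - bought_at) <= WINDOW}
        for mac in macs_present:
            pair_counts[(mac, customer)] += 1
    return {pair for pair, n in pair_counts.items() if n > MIN_COINCIDENCES}

# Made-up data: one customer shops weekly for six weeks with the same phone.
base = datetime(2024, 1, 1, 12, 0)
visits = [base + timedelta(days=7 * i) for i in range(6)]
purchases = [("customer-17", t) for t in visits]
sightings = [("aa:bb:cc:00:00:01", t - timedelta(minutes=10)) for t in visits]
# Another shopper's phone happens to be present on two of those days.
sightings += [("aa:bb:cc:00:00:02", t) for t in visits[:2]]

links = link_macs_to_customers(sightings, purchases)
print(links)
```

Once a pair crosses the threshold, every later sighting of that MAC address can be attributed to a named customer, including visits where nothing is bought.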
To collect even more data, the
department store logs what websites
(or IP addresses) are accessed through
the store’s WiFi.
To summarize … the department store
knows…
- what products a customer buys
- when the customer visits the store (even if
nothing is bought)
- where the customer goes in the store and
which stores the customer has visited
- what is viewed through the store’s WiFi
- details that the store’s app was able to pull
from the phone
Why do this?
If you ask a company what data they
are collecting, they may or may not
tell you and they may or may not give
it to you.
How might this type of data be used
against you? Who might want to buy
it?
What is the Driving Force?
Not to be cynical, but profitability is
key here!
There is value in that data!