HTC Education Series: Getting Started with Deep Learning

This week's lesson is a little bit different from the normal formula. The fastai lecture part of the HTC lesson will be taught by Rachel Thomas rather than Jeremy. She will be discussing the ethical implications of deep learning systems (and AI systems in general).

As a designer of such systems, you need to make yourself aware of potential consequences of the system, what impact(s) it might have on the world, for good, or for evil.

What do you need to be thinking about as a designer of such systems to avoid potentially huge issues developing after they are deployed into the world at large?

Are there design strategies you can follow that help identify potential problems and squash them before they become huge problems?

You can also watch this lecture at the fastai course site here. The advantage is that the site also provides course notes, a questionnaire, a transcript of the lecture (available to train your next generative RNN system), etc.

What is covered in this lecture

What is ethical behavior?

Why should engineers or programmers care?

Do different cultures have different ethical frameworks? And if so, how can we respect and work with them from within our own cultural ethics?

What was the role of IBM in Nazi Germany? Were they complicit in the Holocaust by providing the technology to help run it effectively? What if you had been an engineer working there at the time? What would you have done?

Systems that can potentially impact people's lives (must be evaluated very seriously).

EX: Credit scores, insurance claim validation, court sentencing, missile targeting, job hiring evaluation.

the system needs to be able to tell someone why it made a decision

the system needs to be correctable if it is making mistakes

the system needs to be open because of the above reasons

Bias in datasets used to train deep learning systems directly affects their behavior

system only as good as the data used to train it

this could severely impact the lives of people being monitored or evaluated by these systems.

EX: auto companies evaluating the effect of crashes on humans using crash test dummies

women were more likely to be injured because mostly male dummies were used in the studies. This pattern of biased test data has also been seen in drug testing, clothing design, etc.

EX: AI policing systems can end up amplifying cultural and racial biases already built into the institution.
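One simple way to start looking for this kind of dataset bias is to compare how often each group shows up in your data against how often you would expect it to. Here is a minimal sketch of that idea in Python. The function names, the 90/10 dummy split, the 50/50 reference population, and the tolerance value are all made up for illustration; they are not from the lecture or from any real crash test data.

```python
from collections import Counter


def group_shares(labels):
    """Return each group's share of the dataset as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def underrepresented(labels, reference, tolerance=0.5):
    """List groups whose dataset share is below tolerance * their reference share."""
    shares = group_shares(labels)
    return sorted(
        group
        for group, expected in reference.items()
        if shares.get(group, 0.0) < tolerance * expected
    )


# Hypothetical crash test records: 9 male-type dummies for every female-type one.
dummy_types = ["male"] * 90 + ["female"] * 10
# Assumed reference population split (illustrative only, not real data).
population = {"male": 0.5, "female": 0.5}

shares = group_shares(dummy_types)
flagged = underrepresented(dummy_types, population)
print(shares)
print(flagged)
```

A check like this only catches the most obvious kind of bias (who is missing from the data). It says nothing about whether the labels or measurements themselves are skewed, so treat it as a first sanity check and not a guarantee.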

YouTube's recommendation engine is discussed in detail. A former engineer warned management while he worked there about his serious misgivings and concerns about the recommendation system, and was not appreciated for it. He pointed out that the AI recommendation algorithm ended up promoting dangerous conspiracy theories and misinformation in the world at large. The Guardian covered this in depth in a series of articles.

Rachel points out that this is an example of an AI system where the designers made an incorrect assumption about the data they were modeling: that the data was static. But the system influences what it is supposed to monitor. That creates a feedback loop, which amplifies behavior and changes the very data the system monitors. This can be dangerous when what is being amplified is chaotic and destabilizing information or behavior in the world at large.
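To get a feel for how quickly this kind of feedback loop can run away, here is a toy simulation in Python. It is only a sketch under made-up assumptions: there are just two items, "a" and "b", the recommender shows each in proportion to its share of last round's clicks, and the click rates are hypothetical numbers. It is not a model of how YouTube's system actually works.

```python
def simulate_feedback(rounds, click_rate, initial_share=0.5):
    """Track the share of recommendations given to item "a" over several rounds.

    Each round the recommender shows items in proportion to their current share,
    users click according to click_rate, and the next round's shares are set
    from those clicks. That last step is the feedback loop.
    """
    share_a = initial_share
    history = [share_a]
    for _ in range(rounds):
        clicks_a = share_a * click_rate["a"]
        clicks_b = (1.0 - share_a) * click_rate["b"]
        share_a = clicks_a / (clicks_a + clicks_b)
        history.append(share_a)
    return history


# Hypothetical click rates: item "a" is only slightly more clickable than "b".
rates = {"a": 0.12, "b": 0.10}
history = simulate_feedback(rounds=20, click_rate=rates)
print(f"start: {history[0]:.2f}, after 20 rounds: {history[-1]:.2f}")
```

Even with a small edge in click rate, item "a" goes from half of the recommendations to nearly all of them after 20 rounds, because each round's output becomes the next round's input. If the two click rates are equal the shares never move, which is the "static data" world the designers were assuming.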

Facebook ended up promoting and participating in ethnic genocide in Myanmar by becoming a platform there that spread hate speech and dangerous, incorrect conspiracy theories.

Rachel points out that they assigned fewer than 20 speakers of the native language to try to monitor their promotion of ethnic genocide, even after years of this being pointed out to them. But when the EU threatened to fine them for promoting hate speech, they immediately hired 1200 people specifically to deal with the issue in a very short period of time.

The Markkula Center has 5 ethical lenses you can use to evaluate systems

Fastai has an analysis system you can use to evaluate a project

Additional HTC Course Material

1: In the spirit of this lesson we will first look at a lecture by Xander that will get us pumped up about Generative Adversarial Networks (GAN). Especially the StyleGAN architecture. This particular approach to training and building deep learning neural networks can lead to some fascinating and extremely powerful capabilities.

And the whole sub-genre of deep learning associated with GANs is truly mind boggling. And we've barely even begun to develop the full potential of this technology. So Xander is excited, I'm excited, everyone is excited about GANs.

But like all technologies, they can potentially be used by people with malicious intent. Or perhaps even more frighteningly by algorithms with malicious intent. So now we will look at just one example of malicious use of GAN technology.

So here's a link to an article in 'The Verge' this week that discusses a very malicious use of GAN technology inside of 'deep fake' algorithms being used in AI Bots that target and then try to blackmail individuals with artificially generated porn imagery.

So we are excited about GAN technology. We know it can do amazing things. But it can also be used for harm. Should we not develop it?

I think we should (continue to develop it), but we need to be aware of potential misuses and educate people, avoid making it too easy to misuse. Maybe internal watermarks can help prevent misuse?

This is kind of like asking if Photoshop should have been banned in the late 80s when it first came out because it could be used to create artificial pornographic imagery (it could, right? it was, right?). One can argue that the potential good uses of Photoshop greatly outweighed the potential bad uses. I can make the same argument about GAN technology.

We are going to cover GANs in all of their details in later lessons of the HTC course. This seemed like a good time to introduce some of the ideas (and controversies) associated with them.

2: I wanted to put together an entry portal into the programming part of this course. To help people get going. It's always hard to get started with any new programming system even if you are a professional programmer. And if you have never really done any programming before, you won't have cognitive dissonance associated with old ways of doing things, but you will need to grasp the thought patterns associated with how programs are built and how they work.

We will provide links to these additional programming resources here as they come online later this week.

3: There is a really great lecture that Jeremy gave on the design of the fastai API at Stanford this last February. I've included it here below in the additional HTC specific material for this lesson because I think now is a really great time to step back and reflect on what fastai is all about. And I wanted everyone taking the HTC course to not lose any technical momentum, so since the fastai part of Lesson 5 is light on coding, we'll make up for that here.

I just finished the Part 2 fastai lectures for 2019 (Part 2 2020 has not been made yet). And while watching Part 2 2019, I was thinking about whether the course was a deep learning course, or a course on coding. Because I thought I was taking a deep learning course, and on first pass the Part 2 lectures felt like I was really learning about how to program, and not really about deep learning, or not enough maybe?

But as I've reflected on it more, I came to a different realization: what that Part 2 course had really been about was how to build a deep learning system. So yeah it's a course on coding, but it's a course specifically focused on how to code a deep learning system.

And not only do they build a deep learning system, they do it twice, in 2 different languages to really drive the point home that the design is the key, not the language it is implemented in. First they build the existing fastai system in Python (on top of PyTorch).

But then they build the whole fastai system again from scratch in a second language (Swift, but on top of TensorFlow).

And the way they code it is so elegant. And they do it in the context of building a deep learning system. So it's eminently practical to the task of implementing deep learning. But the principles and design patterns they are using are really an amazing education. An education in how to program.

So this lecture Jeremy gave in February provides a brilliant top down overview of what all of those elegant design decisions are. It also provides a lot of interesting use cases from different fastai students who used the API in very innovative ways, and who also helped develop or flesh out the need for some of the design decisions along the way. You can watch it below.

Observations

1: Personally I find it really great that fastai feels that the ethics of new technology, and how it can/could be used, is important enough to include in a course on a particular technology (deep learning). Because this never would have been the case when I went to undergrad and graduate school in engineering.

If it had been discussed at all, it would have been some obscure elective course in some social studies department. Absurd to include it in an engineering curriculum. The engineering curriculum was pretty macho back in those days.

So it is nice to see that the times they are a changing. Because they certainly need to.

Rachel mentions how deep learning systems deployed into the world can act to change characteristics of the world at large, due to feedback effects.

The best way to think about feedback is to think about a microphone hooked up to a PA system or guitar amp. If you move the microphone too close to the speaker, any super small noise will be amplified over and over again until suddenly it creates feedback that totally swamps the entire system. It's loud and obnoxious (now are we talking about Twitter or microphone feedback?).

YouTube's automatic recommendation system is a classic example of the feedback effect at work. In really bad ways sometimes. It can tend to promote false conspiracy narratives to society at large because they lead to more repeat views and share clicks.

2: Becoming more aware of insufficient knowledge of important issues like diversity, or bias in oneself, and in the organization one builds, or one works at, is very important. Bias in datasets, which leads to baked in bias in engineering systems constructed using those biased datasets, can create all kinds of problems when these bias-trained systems are deployed.

Bias can affect how we may subconsciously build our perceptions of different people. Which can lead to serious consequences for these people if we use our biased perceptions of them to influence decisions that affect their life and well being, the life and well being of their families, of the communities they live in. 'We' can mean you personally, or the group of people you work with or hang out with, or the business you work for.

I thought it would be good to provide some personal observations to help drive these points home. So I'll be candid about some personal flaws, or candid about institutional flaws some organizations I worked for may have had.

2A: So the first pass at this post forgot to include the 'What is covered in this lecture' section that is in all of the HTC lesson posts. I could probably invent all kinds of different rationalizations to avoid personal responsibility, but some tech-macho part of my personality thought this material was too lightweight to need the detailed, break-it-all-down, make-sure-you-learned-everything topic analysis we normally do for each lecture.

2B: What I was noticing as I've gotten more active in the latest Deep Learning scene is how much more diverse the people participating in the scene seemed to be. One example that struck me in the last few days was that the lectures 'important enough' to be included in the HTC coverage of the 2020 AI Musical Creativity course happening this last week were all presented by women. This wasn't some decision I made where I was only going to pick women on purpose to ensure diversity, it just happened because what I thought were really great topics that needed to be in the HTC discussion were presented by women.

At the same time, I'm taking a really great Stanford GAN course that is taught by Sharon Zhou, and some of the HTC bootcamp lectures we're using in the HTC Deep Learning course are really great lectures taught by Ava Soleimany.

So I was talking to a friend of mine yesterday about how refreshing it was that the AI field was actually really diverse now (for a change, not my normal experience in a lifetime of doing tech), certainly not the case at all in the 90s when I was actively involved in neural net research.

So when Rachel started getting into the diversity section in the 2nd half of today's fastai lecture on AI Ethics, she put up a slide stating that only 12% of the people currently practicing AI are women (to demonstrate how bad diversity was). So obviously a big eye opener for me, since I had just been expounding a few hours earlier about how many women were in the field now, and how refreshing that was. 12% is obviously better than 1%, but it needs a lot of improvement to reflect actual population diversity.

Since I'm making this very personal, I will point out that the ratio of graduating women to men in my undergraduate electrical engineering degree program was .0625%.

So 0.625% of my graduating BEE class were women. Pretty bad.

2C: Rachel also mentioned in the diversity section of her talk that just getting women into the tech field via some tech degree program is not enough. A huge problem is that many women end up bailing out of the program. Either before they get their degree, or after a few years of dealing with the realities of being a woman in the tech industry (professional and academic).

Rachel mentions that bias in how women's research or even resumes are viewed can lead to a lack of being taken seriously and misjudgment of qualifications. Which is highly unfair, and must be unbelievably frustrating for people subjected to it.

I have certainly seen this very bad dynamic play out at organizations I have worked at. In the late 90s I briefly worked for a very prestigious small research lab associated with a large multi-national company in the Bay Area. This research lab (think tank, whatever you want to call it) had weekly seminar talks, a usual feature at this kind of tech organization. And I watched a sick and disgusting dynamic play out over time whenever women researchers came in to give technical talks to the group. They were subjected to an intense form of sexist technical bullying.

There were people in this lab that were more than willing to rip you a new intellectual asshole if they sensed any weakness in what you were saying in your presentation. So they were equal opportunity intellectual bullies in that respect, if they thought you had a dumb idea, or were bullshitting them, they would rip you apart.

But the problem was that they had obvious pre-conceived biases against women, and would oftentimes move in for the slaughter with these female job candidates or visiting scholars just because they were female, not because they had a bad idea or didn't understand what they were talking about. I found it so highly offensive that it was a huge factor in my decision to stop working there.

3: One way we're trying to address diversity at HTC involves rethinking how to teach this material in a different way that will make it hopefully more accessible and useable to artists without a heavy technical background. Our hope is that this alternative future 'Deep Learning for Artists' course will help a more diverse set of individuals get excited about and build skills in deep learning and other cutting edge technology that can be used to address their particular areas of artistic interest.

Don't forget to read the course book

Need to review something from the previous lessons in the course?
No problem.

You can access Lesson 1 here.

You can access Lesson 2 here.

You can access Lesson 3 here.

You can access Lesson 4 here.

You can move on to the next Lesson 6 in the course when it posts on 11/02/20.

Comments

Synthetik, October 28, 2020 at 4:49 PM
Andrew Ng included the following in his most recent The Batch email from deeplearning.ai. The content seemed appropriate to include in this course lesson, since it relates to some of the ethical concerns brought up in the lecture.

AI Spreads Disinformation
Will AI promote lies that deepen social divisions?
The fear: Propagandists will bait online recommendation algorithms with sensationalized falsehoods. People who snap at the clickbait will be reeled into opposing ideological silos.
Behind the worries: Consumption of online content has skyrocketed since the pandemic began. Social media platforms, especially, are known to be vectors for disinformation. Bad actors have embraced algorithmically boosted disinformation campaigns to advance their agendas.
This year alone, agitators have exploited these systems to widen political divisions, spread false data about Covid-19, and promote irrational prejudices.
Russian operatives have been blamed for spreading misinformation on a vast range of topics since at least 2014, when the Kremlin flooded the internet with conspiracy theories about the shooting down of a Malaysian passenger jet over Ukraine. That campaign helped to cast doubt on official conclusions that Russian forces had destroyed the plane.
YouTube’s recommendation engine is primarily responsible for the growing number of people who believe that Earth is a flat disc rather than a sphere, a 2019 study found.
How scared should you be: Social media networks are getting better at spotting and blocking coordinated disinformation campaigns. But they’re still playing cat-and-mouse with propagandists.
Earlier this month, researchers found that Facebook users could slip previously flagged posts past the automated content moderation system by making simple alterations like changing the background color.
Creators of social media bots are using portraits created by generative adversarial networks to make automated accounts look like they belong to human users.
Efforts to control disinformation occasionally backfire. Conservative media outlets in the U.S. accused Twitter of left-wing bias after it removed a tweet by President Trump that contained falsehoods about coronavirus.
What to do: No company can tell fact from fiction definitively among the infinite shades of gray. AI-driven recommendation algorithms, which generally optimize for engagement, can be designed to limit the spread of disinformation. The industry is badly in need of transparent processes designed to reach reasonable decisions that most people can get behind (like free elections in a democracy). Meanwhile, we can all be more vigilant for signs of disinformation in our feeds.

Haiku Tech Center

HTC Education Series: Getting Started with Deep Learning - Lesson 5
