Parvez Kose
Pre-Thesis site
- Yann LeCun, Director of AI Research, Facebook, Director, NYU Center for Data Science, Silver Professor of Computer Science, Neural Science, and Electrical and Computer Engineering
- Enrico Bertini, Associate Professor, Computer Science and Engineering, NYU School of Engineering
- Anna Choromanska, Associate Professor, Computer Science and Engineering, NYU School of Engineering
- Kyunghyun Cho, Assistant Professor, CILVR Group, Computer Science (Courant Institute) and Center for Data Science, New York University
- Daniel Shiffman, Associate Arts Professor, Interactive Telecommunications Program at NYU’s Tisch School of the Arts
- Tega Brain, Industry Assistant Professor, NYU School of Engineering
- Iddo Drori, Professor, Columbia University, Dept. of Computer Science, NYU Center for Data Science
- Dana Karwas, Industry Assistant Professor, NYU School of Engineering
- Ann Borray (Messinger), Manager, ViDA Lab, NYU School of Engineering
Alberto Maria Chieric is an entrepreneur and PhD student in the Computer Science and Engineering Department at NYU Tandon School of Engineering.
Joby Jose is a Master's student majoring in Computer Science at NYU Tandon School of Engineering.
The concept for the prototype is to build a client-side artificial neural network that can help deploy and run deep learning models on the web. We developed an image classifier that recognizes objects in images and runs entirely in the browser, using JavaScript and a high-level layers API.
We use TensorFlow's JavaScript library, TensorFlow.js: a WebGL-accelerated, browser-based library for training and deploying machine learning models on the web. It allows training neural networks in the browser or running pre-trained models in inference mode.
This demo uses the pre-trained MobileNet_25_224 model from Keras. Keras is an open-source neural network library written in Python, designed to enable fast experimentation with deep neural networks.
It is not trained to recognize human faces. For best performance, upload images of objects like pianos, coffee mugs, and bottles.
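To make the inference path concrete, here is a minimal sketch, assuming the @tensorflow/tfjs and @tensorflow-models/mobilenet packages; the element ID and function name are illustrative, and the load() options may vary between package versions.

```javascript
// Minimal sketch: load the 0.25-width, 224px MobileNet variant and classify
// an <img> element entirely in the browser. The element ID is hypothetical.
import '@tensorflow/tfjs';
import * as mobilenet from '@tensorflow-models/mobilenet';

async function classifyImage() {
  // version 1, alpha 0.25 corresponds to the MobileNet_25_224 weights
  const model = await mobilenet.load({ version: 1, alpha: 0.25 });
  const img = document.getElementById('uploaded-image');
  const predictions = await model.classify(img);
  // predictions: [{ className, probability }, ...] for the top classes
  console.log(predictions);
}

classifyImage();
```

Because inference runs entirely client-side, the uploaded image never leaves the user's machine.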
I am profoundly excited by the idea of designing and interpreting artificial neural networks. With the growing success of neural networks and deep learning, there is an increasing need to interpret them and understand their design and inner workings.
What better tools could help interpret deep neural networks and other machine learning algorithms?
Deep learning has proved very powerful at solving problems in recent years, and it has been widely deployed for tasks like image captioning, voice recognition, and language translation. There is now hope that the same techniques will be able to diagnose deadly diseases, make trading decisions, and do countless other things to transform whole industries.
Without a clear understanding of how and why a model works, the development of these models typically relies on a time-consuming trial-and-error process. As a result, academic researchers and industrial practitioners are facing challenges that demand more transparent and explainable systems for better understanding and analyzing machine learning models, especially their inner working mechanisms.
I strongly believe this is a problem that is already relevant, and it's going to be much more relevant in the future. Whether it's an investment decision, a medical decision, or maybe a military decision, we don't want to just rely on a 'black box' method. It's time to act on making their decisions understandable, before the technology becomes even more pervasive.
I am interested in the idea of an interactive model that could help understand, diagnose, and refine a machine learning model with the help of interactive visualization.
The idea of explaining how AI techniques work through an interactive model aided by rich visualizations would not only be interesting and useful but also distinct from current tools and methods, which demand significant mathematical notation and statistical knowledge.
I am most excited by the idea of a rich design space for interacting with neural networks, which would help us explore their untapped potential. It promises to be a powerful tool for enabling meaningful human oversight and for building fair, safe, and aligned AI systems.
Link to Dialogic Journal on Google Drive
Olah, Chris, et al. "The Building Blocks of Interpretability." Distill, 2018, distill.pub/2018/building-blocks/.
Keywords: Intelligibility; explanations; explainable artificial intelligence; interpretable machine learning
With the growing success of neural networks and deep learning, there is an increasing need to explain how they make their decisions, whether for building confidence about how they behave in the real world, detecting model bias, or satisfying scientific curiosity.
This paper discusses how we can develop tools to better interpret neural network models. The article categorizes those techniques into (1) feature visualization, (2) feature attribution, and (3) feature grouping, and shows that these can be integrated into an interactive interface as layers that let users get a sense of the internals of an image network. It first motivates the need to understand the network through canonical examples and visualizations of concepts such as neuron activations.
The next section discusses performing feature visualization at every spatial location across all channels. The visualizations are visually appealing and really give a sense of how the model comes to its decision. I'm very excited to see such visualizations applied to other image datasets. The article then discusses how concepts at different layers influence and are influenced by concepts in other layers. This is the first visualization of its kind that I've seen, and it could be extremely valuable in helping to determine many details.
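To make the feature visualization technique concrete, here is a hedged TensorFlow.js sketch of activation maximization: start from noise and push the input image up the gradient of one channel's mean activation. The layer name, input size, and the assumption of a Layers-API model are mine, not details from the article.

```javascript
// Sketch of feature visualization by optimization: find an input image that
// maximizes the mean activation of one channel in a chosen layer.
import * as tf from '@tensorflow/tfjs';

function visualizeChannel(model, layerName, channel, steps = 100) {
  // Probe model that exposes the chosen layer's activations (Layers API).
  const probe = tf.model({
    inputs: model.inputs,
    outputs: model.getLayer(layerName).output,
  });

  // Start from low-contrast noise; this is the variable we optimize.
  const image = tf.variable(tf.randomNormal([1, 224, 224, 3], 0.5, 0.1));
  const optimizer = tf.train.adam(0.05);

  for (let i = 0; i < steps; i++) {
    optimizer.minimize(() => {
      const acts = probe.predict(image);        // shape [1, h, w, c]
      const target = acts.gather([channel], 3); // pick one channel
      return target.mean().neg();               // ascend = minimize the negative
    }, false, [image]);
  }
  return image;
}
```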
Beyond interfaces for analyzing model behavior, we can consider interfaces for taking action on neural networks, where models can learn from human feedback during training. Human feedback on the model's decision-making process, facilitated by interpretability interfaces, could be a powerful solution to these problems. It might allow us to train models not just to make the right decisions, but to make them for the right reasons.
This suggests that we need more fundamental research, at the intersection of machine learning and human-computer interaction, to resolve some of these interpretability issues. Given how complex model behavior is, the author remarks that an important direction for future interpretability research will be developing techniques that achieve broader coverage of model behavior. Another possibility is interfaces for comparing multiple models; for instance, we might want to see how a model evolves during training.
Smilkov, Daniel, et al. "Embedding Projector: Interactive Visualization and Interpretation of Embeddings." arXiv, Nov. 2016, arXiv:1611.05469 [stat.ML].
Keywords: Statistics - Machine Learning, Computer Science - Human-Computer Interaction
An embedding is a map from the data to points in Euclidean space. Embeddings are everywhere in machine learning, appearing in recommender systems, Natural Language Processing, and many other applications. Researchers and developers often need to explore the properties of a specific embedding, and one way to analyze embeddings is to visualize them.
The paper presents Embedding Projector, a new visualization tool that helps users interpret machine learning models that rely on embeddings. Unlike other high-dimensional visualization systems, it is customized for the kinds of tasks faced by machine learning developers and researchers: exploring local neighborhoods for individual points, analyzing global geometry, and investigating semantically meaningful vectors in embedding space.
There are a number of directions for future work on visualization. For example, when developing multiple versions of a model, or inspecting how a model changes over time, it could be useful to visually compare two embeddings. The paper highlights that a second direction for future research could be making it easier for users to discover meaningful directions in the data. While the current interface makes it easy to explore various hypotheses, there may be ways for the computer to generate and test hypotheses automatically.
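As a plain-JavaScript illustration of the first of those tasks, exploring local neighborhoods, here is a hedged sketch of a cosine-similarity nearest-neighbor query over an embedding matrix stored as an array of vectors (the data layout and function names are my assumptions, not the Projector's API):

```javascript
// Cosine similarity between two embedding vectors of equal length.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the indices and similarities of the k nearest neighbors of one point.
function nearestNeighbors(embeddings, queryIndex, k = 5) {
  const q = embeddings[queryIndex];
  return embeddings
    .map((v, i) => ({ i, sim: cosine(q, v) }))
    .filter(({ i }) => i !== queryIndex)
    .sort((a, b) => b.sim - a.sim)
    .slice(0, k);
}
```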
Wongsuphasawat, Kanit, et al. "Visualizing Dataflow Graphs of Deep Learning Models in TensorFlow." IEEE Transactions on Visualization and Computer Graphics, vol. 24, doi:10.1109/TVCG.2017.2744878.
Keywords: Clustered Graph; Dataflow Graph; Graph Visualization; Neural Network
Deep learning models are becoming increasingly important in many applications. Understanding and debugging these models, however, remains a major issue. This paper describes a design study of a visualization tool that tackles one aspect of this challenge: interactive exploration of the dataflow architecture behind machine learning models.
This tool helps users understand complex machine learning architectures by visualizing their underlying dataflow graphs. The tool works by applying a series of graph transformations that enable standard layout techniques to produce a legible interactive diagram.
The design process emphasizes understanding of both users and data: the authors describe a task analysis for developers of deep learning models and outline the challenges presented by model structures. They then present a series of graph transformations to address these challenges, demonstrate usage scenarios, and discuss user reactions.
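One transformation the paper describes is clustering nodes by their hierarchical name scopes. A hedged JavaScript sketch of that idea (the node names are illustrative):

```javascript
// Sketch of name-scope clustering: nodes named "conv1/weights" and
// "conv1/bias" collapse into a single "conv1" group node in the diagram.
function groupByScope(nodeNames) {
  const groups = new Map();
  for (const name of nodeNames) {
    const scope = name.includes('/') ? name.split('/')[0] : name;
    if (!groups.has(scope)) groups.set(scope, []);
    groups.get(scope).push(name);
  }
  return groups;
}

// e.g. Map { 'conv1' => ['conv1/weights', 'conv1/bias'], 'loss' => ['loss'] }
console.log(groupByScope(['conv1/weights', 'conv1/bias', 'loss']));
```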
Users (researchers, engineers, and designers) find the visualizer useful for understanding, debugging, and sharing the structures of their models. This shows that users derived significant value from the visualizations, which is a welcome sign for visualization creators.
The authors conclude that developer reactions suggest a heartfelt desire for better ways to understand machine learning. This is an area in which data is central, but the tools have not matured, and users often feel they operate in the dark. Visualization may have an important role to play.
Olah, Chris, et al. "Feature Visualization." Distill, 2017, doi:10.23915/distill.00007.
Keywords: Deep Learning; Neural Networks; GoogLeNet Visualizations; Interactive Visualizations
Neural networks are algorithms inspired by the biological brain, and deep learning is a set of techniques for learning in neural networks. They have made a huge impact on a wide variety of applications recently, providing some of the best solutions to many problems in image recognition, voice recognition, and autonomous driving.
While deep neural networks learn efficient and powerful representations, they are often considered a ‘black-box’. With their growing influence in our lives, there is a critical need to better understand their decisions and to gain insight into how these models operate. This paper attempts to interpret a neural network model for an image classification task.
Feature visualization is one of the important techniques for understanding what neural networks have learned from image datasets. This paper focuses on optimization methods, discusses major issues, and explores common approaches to solving them.
The author presents a comprehensive overview of feature visualization, mainly focusing on the optimization method. It starts by explaining why optimization is used to visualize features in neural networks rather than finding examples in the dataset. It then discusses how to achieve diversity with optimization, and explores the interaction between neurons: the combinations of neurons that work together to represent images. Finally, it discusses how to improve the optimization process by adding different regularization methods.
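As a concrete example of one such regularizer, transformation robustness, the article describes jittering the image slightly each step so the visualization cannot depend on pixel-exact patterns. A hedged TensorFlow.js sketch (the shift range is my choice; tf.js has no built-in roll, so one is defined here):

```javascript
// Sketch of transformation robustness: randomly shift (jitter) the image
// tensor along its height and width axes before each optimization step.
import * as tf from '@tensorflow/tfjs';

// Roll a tensor along one axis, wrapping values around.
function roll(x, shift, axis) {
  const size = x.shape[axis];
  const s = ((shift % size) + size) % size;
  if (s === 0) return x;
  const [head, tail] = tf.split(x, [size - s, s], axis);
  return tf.concat([tail, head], axis);
}

function randomJitter(image, maxShift = 8) {
  // image shape: [1, height, width, channels]
  const dy = Math.floor(Math.random() * (2 * maxShift + 1)) - maxShift;
  const dx = Math.floor(Math.random() * (2 * maxShift + 1)) - maxShift;
  return roll(roll(image, dy, 1), dx, 2);
}
```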
The author concludes that feature visualization stands out as one of the most promising and developed research directions. By itself, feature visualization might never give a completely satisfactory understanding, but as one of the fundamental building blocks, combined with additional tools, it will empower humans to understand these systems.
Vinyals, Oriol, et al. "Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 4, 2017.
Keywords: Image Captioning; Recurrent Neural Network; Sequence-to-Sequence; Language Model
This paper presents a machine learning model trained to describe the content of an image. Describing images automatically is a long-standing fundamental problem in artificial intelligence that connects two fields: computer vision and natural language processing.
The work is inspired by recent advances in machine translation, such as the Google Translate application. For many years, machine translation was achieved by a series of separate tasks, but recent work has shown that translation can be done in a much simpler way and still achieve better performance.
The paper proposes a neural, probabilistic framework for generating descriptions from images. The model uses an architecture called a Recurrent Neural Network (RNN) that combines recent advances in computer vision and machine translation and can generate natural sentences describing an image. It is trained to maximize the likelihood of the target description given the training image.
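Concretely, that maximum-likelihood objective can be written out as follows, where I is an image, S = (S0, ..., SN) its description, and theta the model parameters; this is the standard formulation from the paper:

```latex
% Training objective: maximize the log-likelihood of descriptions given images
\theta^{*} = \arg\max_{\theta} \sum_{(I, S)} \log p(S \mid I; \theta)

% The sentence probability factors over words by the chain rule,
% which is what the RNN models step by step:
\log p(S \mid I) = \sum_{t=0}^{N} \log p(S_t \mid I, S_0, \ldots, S_{t-1})
```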
They present an end-to-end system for the problem: a neural net that is fully trainable using stochastic gradient descent. Their model combines sub-networks for vision and language, which can be pre-trained on larger corpora and thus take advantage of additional data.
The paper came out of an image captioning challenge organized by Microsoft in 2015, in which the team competed and tied for first place with Microsoft Research. The authors discuss the resulting competition performance and experiments on several datasets.
Step 1 - Big Problem
With the growing complexity of artificial neural networks, the need to understand their inner workings has become critical. They are being used to guide all sorts of key decisions in medicine, finance, manufacturing, and beyond, including determining who gets healthcare, who is approved for a loan, and who gets hired for a job.
Step 1A - Big Problem with Data
Step 2A - Stakeholder map
Step 2B - Stakeholder list
Step 3A Personas
Data scientists and engineers face challenges that demand more transparent and explainable systems for understanding and analyzing the machine learning models in their projects, especially their inner working mechanisms.
Interactive visualization of these models is needed to effectively solve some of these real-world challenges. An interactive model could help data scientists understand, diagnose, and refine a machine learning model.
Step 3B - User List
Area: Machine Learning
Topic: Neural Networks
Brainstorm
Freewrite
Mind map
Question - Response - Question
Augmenting Human Intellect
Explainable AI
Why is a neural network a black box?
Wheel and Spoke
Alex Nathanson and Jason Charles are my accountability partners. We will be meeting briefly before pre-thesis every Tuesday.
Journal Articles
Review Articles
The five articles I read were:
Visual Experiments
Article 1:
The author argues that in order to discover how the brain works, we need to look to higher dimensions. The research suggests that as our brains think, learn, and remember, they create elaborate but ephemeral structures in mathematical dimensions. These structures, which appear and disappear, could help us understand how the brain creates our thoughts and feelings.
What does this do for me?
I was intrigued by the finding that the researchers have been using algebraic topology, a field of mathematics used to characterize higher-dimensional shapes, to explore the workings of the brain.
What is interesting about this article?
The Blue Brain Project, launched by the researchers, aims to simulate the entire human brain inside a computer. Not only have they managed to do their research in a computer, but they have also replicated some of its findings through actual biological research.
What questions is this making me ask?
What are those higher-dimensional structures? What if the geometric structures they represent exist in mathematical dimensions higher than we can visualize?
Article 2:
The article discusses the idea to reverse engineer the brain mechanisms that underlie human visual intelligence.
What does this do for me?
The idea that a better understanding of the brain could lead to better computer algorithms and AI deepened my interest in learning more about brain mechanisms.
What is interesting about this article?
The Beautiful Brain, a 3-D interactive visualization based on the drawings of Santiago Ramón y Cajal, drew me in with its visual presentation of human intelligence.
What questions is this making me ask?
Article 3:
The article examines autonomous vehicles, sensing technologies, indigenous cartography, and non-human spatial sensibilities.
What does this do for me?
The capacities and limitations of machine-mapping and autonomous spatial technologies need attention.
What is interesting about this article?
Mapping made for and by machines is a big business opportunity at present. Yet mapping's artificial intelligences also have the potential to transform myriad design and research areas, to influence policy-making and governance, and to support environmental preservation and public health.
What questions is this making me ask?
How do machines conceptualize and operationalize space? How do they render our world measurable, navigable, usable, conservable?
Article 4:
Tomaso Poggio, the director of the Center for Brains, Minds, and Machines (CBMM) at MIT's McGovern Institute for Brain Research, discusses the implications for human intelligence.
What does this do for me?
Artificial neural networks are modeled loosely on the human brain. They can train themselves to recognize complex patterns, and they provide solutions to many problems in image recognition, speech recognition, and natural language processing.
What is interesting about this article?
The author explains that the problem of intelligence is not only the engineering problem of building an intelligent machine, but also the scientific problem of what intelligence is, how our brain works, and how it creates the mind.
What questions is this making me ask?
What are the major technology trends driving deep learning? How do I build, train, and apply fully connected deep neural networks?
Article 5:
The article presents Embedding Projector, a system for interactive visualization and analysis of the high-dimensional datasets used in machine learning projects. Researchers and developers often need to explore the properties of a specific embedding, and one way to analyze embeddings is to visualize them.
What does this do for me?
A new tool that helps users interpret machine learning models visually could reduce common problems in machine learning, such as overfitting and underfitting, and eventually lead to better accuracy.
What is interesting about this article?
How the Embedding Projector reduces arbitrary high-dimensional data into a simple two- or three-dimensional view is interesting.
What questions is this making me ask?
What are the major technology trends driving deep learning? How does it apply dimensionality reduction techniques such as t-SNE and PCA to an arbitrary dataset?
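To answer the PCA half of that question for myself, here is a hedged, plain-JavaScript sketch that projects mean-centered data onto its top two principal components using power iteration with one deflation step (fine for small datasets; the function names are mine, and t-SNE is substantially more involved):

```javascript
// Sketch: reduce d-dimensional rows to 2-D coordinates with PCA, using
// power iteration on the covariance matrix plus one deflation step.
function pca2d(data) {
  const n = data.length, d = data[0].length;

  // Mean-center the data.
  const mean = new Array(d).fill(0);
  for (const row of data) row.forEach((v, j) => { mean[j] += v / n; });
  const X = data.map(row => row.map((v, j) => v - mean[j]));

  // Sample covariance matrix (d x d).
  const C = Array.from({ length: d }, () => new Array(d).fill(0));
  for (const row of X)
    for (let i = 0; i < d; i++)
      for (let j = 0; j < d; j++) C[i][j] += (row[i] * row[j]) / (n - 1);

  const matVec = (M, v) => M.map(r => r.reduce((s, m, j) => s + m * v[j], 0));
  const dot = (a, b) => a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = v => Math.sqrt(dot(v, v));

  // Power iteration: repeatedly apply C and renormalize.
  function topEigenvector(M, iters = 200) {
    let v = Array.from({ length: d }, () => Math.random());
    for (let k = 0; k < iters; k++) {
      const w = matVec(M, v);
      const nw = norm(w);
      v = w.map(x => x / nw);
    }
    return v;
  }

  const v1 = topEigenvector(C);
  const lambda1 = dot(v1, matVec(C, v1));
  // Deflate: subtract the first component, then find the second.
  const C2 = C.map((row, i) => row.map((c, j) => c - lambda1 * v1[i] * v1[j]));
  const v2 = topEigenvector(C2);

  // Project each centered row onto [v1, v2].
  return X.map(row => [dot(row, v1), dot(row, v2)]);
}
```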
Areas of Interest
Inspiration cards
Neural Networks
Power of Algorithms
Prediction Machines
Dark Matters
API for Non-Profits
Initial Review Investigations
How does a neural network work?
Algorithms are already all around us. They drive Google's search engine, power Netflix's recommendations of what to watch next, Amazon Alexa's voice assistant, Tinder's matchmaking, autonomous vehicles, high-speed trading, and an ever-growing number of services and technologies.
Artificial Intelligence will affect everyone, everywhere. The past few years have seen rapid advances in AI, with new technologies achieving dramatic improvements in technical performance.
Where does this ultimately lead? Software that thinks and does: software with cognitive abilities that predicts people's behavior, drives autonomous vehicles, and more.
APIs are all about sharing. An API is an interface that lets one software program "talk" to another, exchanging data behind the scenes. APIs power much of what we do online, and startups and established companies alike are leveraging them to become a bigger part of our lives.