Tom Bishop is Chief Technology Officer at Glass Imaging. He is an algorithms expert with 15 years' experience in AI/Deep Learning, Computer Vision, Computational Photography and Image Processing. From 2013 to 2018, Tom developed core technology at Apple powering the iPhone's Portrait Mode. Enjoy his insights on future trends and his perspective on bias.
#360pulse, #circleoptics, #computationalphotography, #imaging, #innovation, #mobiography, #podcast
What was the earliest memory you have as an innovator?
Growing up, I embarked on several journeys that have shaped who I am today. One of my most influential experiences was diving into the world of computer games. I received my first computer, a Commodore 64, when I was just seven years old. The Commodore 64, a computer from the 1980s, might not ring a bell for many today, but for those of us who grew up during that era, it was a game-changer. What made it unique was that using it often required a basic grasp of programming. I remember loading games from cassette tapes, but it wasn't long before I became fascinated with the idea of creating my own games. This led me to learn programming languages like BASIC and assembly, and I began to explore the graphics behind games.
As time went on, my interest shifted. When I got a Mac computer later in life, I delved deeper into graphics, mastering tools like Photoshop. Initially, I envisioned a future for myself in game development, but my passion shifted towards the graphics and imaging side. This newfound interest spurred me to take up photography. In fact, during my student years, I worked as a freelance photographer. This not only provided me with extra income but also deepened my appreciation for the art and science of imaging. I became captivated by the convergence of technology, mathematics, physics, and optics in the realm of photography.
What were the challenges involved in developing the technology behind Portrait Mode?
Apple is known for secrecy, so I will share what I can.
At a high level, I believe my hiring at Apple was influenced by the work I did during my PhD and postdoc, which was highly relevant to their objectives. Many companies, like Google with their Pixel, have launched similar portrait mode algorithms. Essentially, these phones aim to separate the foreground from the background in images.
The real challenge was rooted in Steve Jobs’s vision, inspired by companies like Lytro. They were working with light field cameras in the mid-2000s, which enabled artificial refocusing after a photo was taken. While their approach used complex hardware, Apple’s vision was to mimic the shallow depth of field look, characteristic of DSLRs, on a smartphone.
The challenge was estimating depth from an image. Typically, smartphone cameras have small apertures, resulting in sharp backgrounds. In contrast, a DSLR or a professional mirrorless camera, with its larger sensor, gives a blurred background, creating a sought-after aesthetic. To achieve this effect on a smartphone, we introduced a second camera with the iPhone 7 Plus. Using both cameras, the system could gauge depth via stereo vision, correlating corresponding pixels between the two images. This method of stereo vision, although not new and used in various industries, was revolutionary for phones.
Implementing a second camera was a significant step. Our next challenge was developing algorithms to run efficiently on phones with limited computational power at the time. A major part of our work was creating algorithms that estimated depth efficiently in near real-time, then using that data to blur the background, simulating the bokeh effect in photography.
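The stereo-then-blur pipeline described above can be illustrated with a toy sketch. This is not Apple's implementation; the function names, window sizes, and the brute-force block-matching search are all illustrative. It estimates disparity (horizontal pixel shift between the two views, which is inversely proportional to depth), then keeps near pixels sharp and box-blurs the rest to mimic bokeh.

```python
import numpy as np

def disparity_map(left, right, max_disp=8, block=3):
    """Brute-force block matching: for each left-image pixel, find the
    horizontal shift into the right image minimising the sum of absolute
    differences over a small window. Larger disparity = closer object."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = np.inf, 0
            for d in range(min(max_disp, x - half) + 1):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1]
                cost = np.abs(patch - cand).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp

def box_blur(img, k=3):
    """Simple k x k box blur, standing in for a real bokeh kernel."""
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    out = np.zeros(img.shape, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def portrait_blur(img, disp, fg_thresh, k=5):
    """Keep high-disparity (near) pixels sharp, blur the background."""
    return np.where(disp >= fg_thresh, img, box_blur(img, k))
```

Production systems add regularization, occlusion handling, and (today) learned depth networks on top of this basic matching idea; the sketch only shows the geometric core.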
Over the years, these methods have evolved, with deep learning now playing a significant role. Many methods are available to estimate depth, but our fundamental approach was pioneering.
What made you want to leave Apple to work for a start-up?
While I spent several years at Apple, I was always intrigued by the startup world, especially here in the Bay Area, California. Many of my friends ventured into startups, and their stories were captivating. Prior to Apple, I even worked at a startup in London for a couple of years, so I wasn’t entirely unfamiliar with that landscape. However, the entrepreneurial spirit here in the Bay Area is distinct. It seems almost everyone you meet in San Francisco is passionate about their startup.
Being at a giant like Apple, you realize the profound impact a big company can have. It’s undeniable that innovations in smartphone cameras over the last decade, many of which I contributed to, transformed the camera industry and manufacturing technology. These advancements were driven by the public’s desire for better photography. That’s one of the benefits of being at a major firm.
However, there are drawbacks. Large companies inherently have bureaucracy, organizational hierarchies, and other structures that can stifle certain types of innovation. It reminds me of the book “The Innovator’s Dilemma,” which delves into this very issue. While significant innovations can be made when a whole corporation backs a vision, groundbreaking, paradigm-shifting ideas often find a more nurturing environment in startups.
At my core, I’m a researcher with a drive for continual creation. While I was lucky to innovate within a massive organization like Apple, there comes a point when your vision surpasses what can be achieved within such confines. That pull towards unrestrained innovation is what drew me back to the world of startups.
How do you address the challenge of training data for imaging purposes in a small company?
Indeed, data annotation is a cornerstone challenge for many startups, especially in the realm of machine learning. Fortunately, we’ve positioned ourselves in a sweet spot where we can generate our own training data, blending simulation and real-world captures. Data annotation isn’t just resource-intensive; it often introduces biases based on annotator opinions or labeling choices. If possible, minimizing human interference is advantageous. Techniques like self-supervised learning, which I explored heavily in my previous venture, offer potential solutions. With such methods, data essentially labels itself, or we employ minimal human input to make corrections. It’s fascinating.
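The "data labels itself" idea can be sketched minimally. This is an assumption-laden illustration, not Glass Imaging's pipeline: each clean capture is corrupted with a synthetic sensor-noise model, and the clean image itself serves as the training target, so no human annotation is needed.

```python
import numpy as np

def make_training_pairs(clean_images, noise_sigma=0.05, seed=0):
    """Self-labelling by simulation: corrupt each clean capture with a
    synthetic noise model, yielding (noisy input, clean target) pairs
    with zero human annotation. The noise model here is a placeholder;
    real pipelines simulate the actual sensor and optics."""
    rng = np.random.default_rng(seed)
    pairs = []
    for img in clean_images:
        noisy = np.clip(img + rng.normal(0, noise_sigma, img.shape), 0, 1)
        pairs.append((noisy.astype(np.float32), img.astype(np.float32)))
    return pairs
```

A restoration network trained on such pairs learns to invert the simulated degradation, and no annotator opinions ever enter the loop.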
How do you see computational photography evolving with the advent of these new AI technologies?
Computational photography has evolved significantly. Previously, it relied on handcrafted techniques demanding skill and mathematical prowess due to limited computation. Now, the industry has shifted to a data-driven approach, learning from vast numbers of images and understanding the nuances of various imaging systems. This shift allows us to simulate intricate aspects of physical processes like optics or sensor behaviors. Essentially, computational photography seeks to address hardware limitations via software solutions.
At my company, Glass Imaging, we’re pioneering a co-design approach, intertwining the lens or optical system with advanced algorithms, such as deep learning. This contrasts traditional methods that merely attempt to refine what the camera produces. Instead, we’re holistically optimizing both hardware and software to produce superior images. Recent advancements have also enabled us to run deep learning models directly on devices like smartphones, right where the data is captured. It’s remarkable to think that techniques once reserved for supercomputers can now operate on the devices in our pockets.
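The co-design principle can be shown with a toy example. This is not Glass Imaging's method: a simple Wiener deconvolution stands in for the deep network, and a Gaussian PSF width stands in for the lens design parameter. The point is that the optics are scored by the error *after* software restoration, treating hardware and algorithm as one system.

```python
import numpy as np

def gaussian_psf(size, sigma):
    """Illustrative lens point-spread function: a normalised Gaussian."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def capture(clean, psf, noise, rng):
    """Forward model: optical blur (FFT convolution) plus sensor noise."""
    H = np.fft.fft2(psf, s=clean.shape)
    blurred = np.real(np.fft.ifft2(H * np.fft.fft2(clean)))
    return blurred + rng.normal(0, noise, clean.shape)

def restore(img, psf, k=1e-2):
    """Wiener-style deconvolution of the known PSF, regularised by k.
    A learned restoration network would take this role in practice."""
    H = np.fft.fft2(psf, s=img.shape)
    G = np.fft.fft2(img)
    return np.real(np.fft.ifft2(np.conj(H) / (np.abs(H) ** 2 + k) * G))

def co_design(clean, sigmas, noise=0.01, seed=0):
    """Pick the lens parameter (PSF width) that minimises error AFTER
    restoration -- judging optics and algorithm jointly."""
    rng = np.random.default_rng(seed)
    scores = {}
    for s in sigmas:
        psf = gaussian_psf(9, s)
        raw = capture(clean, psf, noise, rng)
        scores[s] = np.mean((restore(raw, psf) - clean) ** 2)
    return min(scores, key=scores.get), scores
```

Traditional pipelines would judge the lens by the raw capture alone; scoring the restored output instead can favour a different design point, which is the essence of co-design.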
What are your thoughts on ethical considerations, especially in image processing and recognition?
Technology itself is neutral; it’s how we use and interpret it that truly matters. It’s crucial for policymakers and society at large to be well-versed in the fundamentals of AI and deep learning. This means not just halting technological advances out of fear but ensuring that everyone understands the capabilities and limitations of these technologies. Better STEM education, especially for those entering governance, can facilitate informed decision-making.
When it comes to bias, it’s a fundamental part of both machine learning and human cognition. Bias enables learning and differentiation. However, understanding and managing bias is paramount. In machine learning, bias helps in distinguishing data points and recognizing essential elements. I delved into anomaly detection in my last startup, where understanding and detecting outliers was crucial. The challenge is discerning between important variables and noise. Bias becomes an issue when the data being fed into the model is skewed or misinterpreted.
I believe bias is inherent to learning. We need to ensure the data used is accurate, and the results are interpreted correctly. We must always be asking the right questions and remain vigilant in our analyses.