Jake Austin

About Me

I’m doing my EECS PhD @ MIT, advised by Professor Kaiming He. I did my B.S. and M.S. in CS at UC Berkeley, advised by Professor Angjoo Kanazawa.

Broadly speaking, I’m interested in ML methods capable of escaping from Plato’s cave: that is, learning to reason about the 3D world from monocular data sources.

I care a lot about education and teaching. I spent a lot of my free time teaching while at UC Berkeley: I was on course staff twice for CS 182 (Deep Learning) and was a TA once for CS 180 (Intro Computer Vision and Computational Photography). I also founded, led, and was lead instructor for my own class, CS 194-126 (Deep Learning for Computer Vision), which was organized in conjunction with Machine Learning at Berkeley (ML@B), an organization I was active in during my undergrad. My aim for the course was to bridge the gap between the supply of and demand for ML education at UC Berkeley, taking in students with elementary Python, calculus, and linear algebra knowledge and teaching from fundamentals all the way up to the bleeding-edge computer vision used in industry, with an emphasis on implementation and practice. I led this course in its first semester, which had an enrollment of ~80 students. The education committee of ML@B has continued this course in subsequent semesters, teaching over 200 students at the time of writing.

Research Interests

At a high level, I’m currently most interested in ML methods capable of escaping from Plato’s cave.

Real-world ground-truth 3D data is often difficult to collect, or it relies on the simplifying assumption that the world is static. This can limit the scalability of 3D understanding across categories in the real world (animals, for instance, won’t always stay still long enough for you to move a camera around them). My research focuses on getting around that by forcing explicit 3D understanding from in-the-wild monocular data, which is prolific and readily available. This means identifying novel learning algorithms, architectures, etc. that can leverage and scale 3D understanding from internet-scale images and videos in the wild.

In general, I’m interested in seeing computer vision used in the real world. Whether it’s in real-world robotics or artistic endeavors, I want to see individuals empowered by ML systems that can work in the real world.

Publications

Shape of Motion: 4D Reconstruction from a Single Video

Qianqian Wang, Vickie Ye, Hang Gao, Jake Austin, Zhengqi Li, Angjoo Kanazawa

PDF Code Project Page

NeurIPS

Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives

Tom Monnier, Jake Austin, Angjoo Kanazawa, Alexei Efros, Mathieu Aubry

(NeurIPS), 2023.

PDF Code Project Page

SIGGRAPH

Nerfstudio: A Modular Framework for Neural Radiance Field Development

Matthew Tancik, Ethan Weber, Evonne Ng, Ruilong Li, Brent Yi, Justin Kerr, Terrance Wang, Alexander Kristoffersen, Jake Austin, Kamyar Salahi, Abhik Ahuja, David McAllister, Angjoo Kanazawa

(SIGGRAPH), 2023.

PDF Code Project Page