Could you start by sharing a brief overview of your research and its applications?
I work at the intersection of machine learning and economics, and the motivation is really thinking about how we use machine learning algorithms in the real world, and what happens when we actually use them. In the real world, algorithms have to interact not only with people, but potentially also with other algorithms, and they have to deal with situations that we’ve historically seen machine learning algorithms struggle with. If you think back to applications of machine learning in the real world, there’s all these famous failures. There’s self-driving cars that crash because they can’t tell the sky from a truck, or that won’t cross an intersection because they can’t figure out the intent of other agents that might be present. There’s also all types of problems that happen in e-commerce markets, in social networks, and so on. And you can really think of these issues as arising because learning algorithms are coming into contact with people and other algorithms.
A big part of my research is just trying to understand what can go wrong when we use algorithms in environments containing real people who might have their own objectives. We’re also thinking about how we would design algorithms specifically for these environments so that they can have provable guarantees or at least economically meaningful measures of performance. My main application areas are in transport systems (ride-sharing platforms are a major example of this), e-commerce platforms, social networks, the delivery of healthcare — all of which have this kind of flavor of algorithms meeting people and interacting with other algorithms.
What do you see as the short-term or long-term directions of your work and the field as a whole? What is the end goal you would eventually like to reach?
The kind of pipe dream is this idea that we can develop algorithms that actually have a beneficial impact in society. And what I mean by that is having algorithms that don’t make things worse when they’re actually deployed in the environment — instead, they actively improve things. Maybe if it’s in transport systems, it improves congestion in cities. Maybe if it’s on e-commerce platforms, it lowers prices and makes the market more equitable. But doing this while keeping in mind issues of fairness and biases in data is really hard. But I think it’s a really important problem, because people aren’t going to stop using algorithms in the real world, and we really need to start figuring out how we do this properly, fairly, and ethically. That’s a big part of what my research is trying to do.
Is there anything you’re really excited about in your work right now, such as a current project or new idea?
Theoretically, we were able to show that one of the most important aspects when you’re using learning in the real world is how fast you’re learning. Often, we think that learning fast is always better, but when we started applying these types of ideas in games — where you’re learning against an opponent who’s also learning — we found it might not be better. And actually, the speed at which you learn changes the equilibria you might find, but different equilibria have different properties. Some might be more socially optimal, some might be better for individuals, and so on. So all of a sudden, we now have this way of achieving different solutions by changing how fast we learn.
Something I’m really excited about is that this is a phenomenon we’re seeing in real data from pricing on e-commerce platforms, for example. We see that people change how fast they update the prices relative to their competitors to get an advantage. Sometimes people update their prices very very slowly, and then they’re actually better off and are able to get more of the market. In other cases, they update much more quickly and can be better off as well. So there’s this interesting degree of freedom, which I don’t think has been explored much. I’m very interested in exploring the dynamics of learning algorithms in these game theoretic settings, which I think will be a really exciting area in the next couple of years.
What are the biggest challenges that come with performing this research? What are the things you have to do to get your work moving forward?
A lot of my work is more on the theory side, mostly trying to develop algorithms that satisfy some theoretical guarantees and seeing that they work in practice. There’s two main challenges: on the data side, a lot of companies and entities that use algorithms in the real world don’t want to share their data because they don’t want to show that they’re doing something that isn’t working, or they’re scared of revealing some sort of inherent bias in their data. On the theoretical side, you have to bring in ideas from a lot of different fields to try and understand these problems. You need ideas from game theory and economics to deal with these issues of multiple agents all interacting, but then you need ideas from control and dynamical systems to think about the dynamics that are arising in these situations. And you need to bring in ideas from computer science to deal with the algorithmic parts. It’s really a research agenda that’s at this intersection of three different domains, so there are natural challenges regarding how they interact and whether you can make these things play nicely. It’s a really cross-disciplinary research agenda, and that comes with challenges, but also a lot of opportunities.
What has it been like working in both the CMS and economics departments at Caltech?
I only recently started, but it’s been a very satisfying experience because from what I’ve seen, economics and the computing sciences are very tightly integrated at Caltech. The economists are familiar with a lot of the tools and ideas from computer science and engineering, and the people in engineering also have some connections with economics, so it’s been really interesting to work at this interaction and play both sides. In my view, Caltech is extremely well positioned in this new research area that’s emerging at the intersection of machine learning and economics. We have a very strong mathematical microeconomics core, along with very strong experimental economists that understand decision making, whose ideas we can then embed into our algorithms. On the other hand, it has a very strong computer science and mathematical sciences department. So it’s been a really fun, unique opportunity, and part of that is only possible because of the small size of Caltech, which makes it easy to work between divisions and departments.
What drew you to this area to begin with? How did you get started in either machine learning or economics?
The way I got into computer science, machine learning, and economics is a bit weird. I basically started by getting my hands on data. We were looking at data about how people park their cars in Seattle, and there’s this idea that a large part of the congestion in Seattle is caused by people just looking for parking. So the question was whether we could come up with a model for this and see if this is true. Something very natural that emerged from this problem is to model parking as a game between people: those trying to drive on the city streets and those looking for parking spots. We started doing that, and that got me thinking a little bit more about how we can even use learning algorithms in game theoretic settings. From there, I kind of moved more and more towards machine learning and game theory, and then economics more broadly. So it was not an immediate process — I didn’t go into grad school knowing that this was what I wanted to do, but it was a very natural thing because this research area allowed me to draw on ideas from game theory, which I thought was fascinating, but also apply tools from machine learning, dynamical systems, and control, all in the context of interesting, real-world problems.
What were your mentors like as you were working in multiple fields? What has your experience been as a mentor to students of your own?
What may be surprising is that in my undergrad, I was actually doing computational biology. I had a set of really good mentors who allowed me to explore my interests in that field and help me discover that the thing that excited me most in computational biology was actually mathematical modeling and coming up with dynamic models of how, for example, drugs propagate through biological systems. When I got to Berkeley for grad school, I again had mentors who could really get me thinking about big open questions and the emerging themes and questions that were arising as machine learning and algorithms are deployed in the real world. So I’ve been extremely lucky throughout.
Here at Caltech, there’s two main parts of mentorship for me. The first is giving people the freedom to be creative and come up with their own ideas while giving them guidance through high-level research goals and things like this. The second thing is mental health. I’ve had a lot of close friends struggle with mental health in grad school and undergrad, so for me, a big part of mentorship is making sure the person feels very supported on the day-to-day and has something to fall back on. A lot of these environments, grad school and even undergrad at such an academically rigorous place, can be very high-pressure, so you need to have strong support networks for students.
What are some of your favorite parts about being a researcher and professor?
The best part of doing research is being able to talk to people and all of the random conversations you have in hallways or in front of white boards. It’s also just interacting with students and people with such different backgrounds and interests. That’s kind of the standard answer, but it’s really the most interesting part for me, especially at Caltech where you can meet so many people who do so many different things. You can go from a conversation about fluid mechanics to game theory and auctions in the same hour. It’s all very exciting.
Do you have any advice for students who want to get into research, especially within your field?
As I mentioned, I was in bioengineering when I started undergrad, and I ended up in EECS (electrical engineering and computer science); now, I’ve moved towards economics. I think being multidisciplinary and having varied interests is a good thing. Interesting things happen at the intersection of fields, so don’t be scared of trying to make connections between fields.
Getting involved at Caltech, I think, is really a question of persistence. If professors aren’t answering (and I’m definitely guilty of this, too), it’s not because they don’t want to — it’s just that they get too many emails. We want to work with students at Caltech since you’re all so good, so it’s very important to not be afraid to just send the emails and knock on doors. I was definitely worried when I sent my first few emails looking for positions, but in the end it’s just biting the bullet and doing it.
Professor Frances Arnold is the Linus Pauling Professor of Chemical Engineering, Bioengineering and Biochemistry at Caltech and a 2018 Nobel Laureate in Chemistry. Her research focuses on using directed evolution to create new enzyme function. Prof. Arnold received a B.S. in Mechanical and Aerospace Engineering from Princeton University and received a Ph.D. in Chemical Engineering from the University of Berkeley.
Favorite book: Too many to choose from. I am now enjoying Ai Wei Wei’s memoir
Favorite Place: Planet Earth
Favorite Food: oysters
Favorite Protein molecule: cytochrome P450 BM3
What type of research does your lab focus on?
We breed molecules like you can breed cats and dogs, selecting for the traits that are useful to us in a process that mimics evolution by artificial selection. The ‘directed evolution’ methods we pioneered to engineer better enzymes are now used all over the world for everything from laundry detergents, cosmetics, treating disease to making jet fuel from renewable resources. I envision a world where nature’s clean and efficient chemistry replaces dirty human manufacturing processes. Wouldn’t it be wonderful to make our pharmaceuticals, cosmetics, fuels and chemicals like we make beer? Wouldn’t it be wonderful to have microbes eat our plastic wastes and convert them to valuable new materials? That’s all possible with enzymes. But someone has to make them do what humans want rather than what they do in nature. That’s what we do.
What aspects of research do you most enjoy?
I love discovering new things that enzymes can do. Sharing the joy of discovery with others is almost as much fun. And since my students are always discovering new things, it’s a blast to go to work.
What influenced you to pursue science? Did you have any role models growing up?
My father was a nuclear physicist and also did a lot of engineering design. I wanted to be like him when I grew up, but I didn’t love nuclear physics. I tried a few things before I found protein engineering, at the beginning of the DNA revolution.
What was the origin for your directed evolution idea? When did you first think of it?
When no existing design approaches worked to make better enzymes, I was getting a bit desperate. I decided to try lots of mutations, easily done randomly using error-prone polymerase chain reaction (PCR) and testing by screening in plates. I was thrilled when we found beneficial mutations in surprising places, where no one would have predicted. We accumulated those mutations in sequential rounds, and sometimes recombined them, just like breeding cats or dogs over multiple generations. That worked beautifully, and evolution became the only reliable method for generating better enzymes. It was an instant ‘hit’ with people in industry, but took much longer to gain the respect of scientists, who always want to understand why. To understand why, however, it’s much nicer to start with the correct answer and work backwards. Directed evolution gave the correct answer.
How did you become the Nobel-prize winning scientist you are today?
I don’t really know for sure, but it probably has something to do with Caltech. I came here naïve and with limited training, but a good idea. I was able to implement my ideas here, and formulate even better ones with the help of critical colleagues and brilliant students and postdocs. I never stopped to worry whether it would be possible to make better enzymes or whether I would be able to do it. Nor did my students.
What is your work in Washington like?
Nobel Laureates are always warned not to get involved in things we know nothing about. I threw that advice to the wind when I took on co-chairing the President’s Council of Advisors on Science and Technology (PCAST) in January, 2021. I find it fascinating. And I have already learned that science is easy compared to dealing with…people. People are really complicated. We can have all the right science, but if people don’t trust it or won’t use it for other reasons, it goes nowhere. In this job, I am also seeing many fascinating new science problems and solutions.
I see this work (which is unpaid and takes a lot of time, by the way) as a way to pay back, to make sure that future scientists will have the same wonderful opportunities that I enjoyed.
What are some of the challenges you’ve recently faced? How do you overcome them?
We all face challenges: loss of loved ones, illness, failure. If you have lived as long as I have, you will experience all of those, perhaps multiple times. I find that the best way to face life is to be grateful for what I have, which is so much. I try not to dwell on what I have lost, nor do I feel sorry for myself. I am grateful to still be healthy and able to do something useful.
What are your interests outside of science?
I love music, gardening, hiking, meals and conversation with friends. I used to love travel, but I haven’t been many places lately…
What advice would you give to Caltech students and other young students interested in research?
Predicting the habitability of planets is closely tied to the amount of water, but understanding how water was delivered to the Earth is not well understood. Currently, we know there is a connection between the bulk composition of Earth, essentially the Earth’s major composing elements, and the total water content. A key component in explaining this connection is chondrites, which are early unaltered meteorites from 4.5 billion years ago. Chondrites can help explain this connection by giving us insight into the thermochemical environment during the early stages of planetary formation. By comparing current chondrite samples to the results of model, we can better understand the mechanisms that allowed terrestrial to form in the inner disk and how their bulk compositions evolved.
Author: Rohan Iyer California Institute of Technology Mentor: Yoshinori Miyazaki Department of Computing & Mathematical Sciences, Caltech Editor: Agnim Agarwal
Understanding the processes that could produce bulk compositions similar to Earth’s is crucial in constraining how and where planets formed and grew in the early Solar System. A key component in understand these processes arise through analyze the evolution of chondrites. Chondrites are early unaltered meteorites from 4.5 billion years ago and they are likely candidates for the building blocks of Earth. In particular, the chemical composition of chondrites has a puzzling characteristic as chondrites exhibit surprising depletions in refractory elements including Al, Ca, and Mg. However, we still lack an astrophysical model to bridge the gap between the early stage of protoplanetary disk evolution, dense solar gas and dust rotating around a young star, and the formation of planetesimals, growing planet cores. The formation of planetesimals within the evolution of the protoplanetary disk still remains unclear; To this end, we built a self-consistent model solving for the disk evolution from the dust condensation to the planetesimal growth stage. The key mechanisms for planetesimal formation considered in the study are pebble accretion, streaming instability, and the transition of magnetorotational instability (MRI) modeled through . In our model, we consider the effects of different values due to MRI as well as dust to gas ratio thresholds and accretion timescales that onset streaming instability. Our results demonstrate that the turbulence from an transition from MRI to in conjunction with streaming instability and pebble accretion creates lower pressure regions in which dust and pebbles accumulate within the terrestrial region. However, our model doesn’t account for two chemically distinct Mg-rich reservoirs which are necessary for explaining why Earth and enstatite chondrites share similar isotopic composition but different bulk compositions. We anticipate that our model will be used to constrain the viable parameter space for the formation of planetesimals as well as be a motivation to explore the mechanisms that produce a greater depletion in refractory elements.
Bridging the gap between the chemical and astrophysical mechanisms is still largely debated. A key aspect to understanding this disconnect is by looking at chondrites. Chondrites provide a connection that can be used to constrain the building blocks of Earth and the thermochemical environment during early protoplanetary disk formation. One of the challenges in understanding the bulk composition is the lack of understanding in how the elemental compositions of chondrites evolve. Early planetary formation is supported by condensation theory which suggests that planets condensed from gas in the early nebula. As the solar gas dissipated over time, the protoplanetary disk cooled which suggests that volatile elements are expected to be depleted. However, chondrite samples show a surprising depletion in refractory elements like Ca, Mg, and Al (Wasson and Kallemeyn, 1988). One consideration for the depletion of refractory materials is that they were delivered to the inner disk because of their condensation temperatures. Refractory materials have higher condensation temperatures than volatiles, ~1500-1800 K for Al and Ca rich solids. Thus, it is possible that the refractory solids were able to diffuse inwards and sustain the higher temperatures of the inner-disk, resulting in the larger depletion of refractory material than volatile material.
When considering the astrophysical dynamics, the mechanisms that create terrestrial planets are still controversial. Accretion processes often consider two main objects: pebbles and planetesimals. Pebbles are millimeter to centimeter sized bodies, while planetesimals are kilometer sized objects. A promising mechanism in which the orbiting solids gravitationally collapse onto the growing planetesimal is streaming instability, which occurs when the dust to gas ratio or pebble to gas ratio surpasses a threshold. Once planetesimals form, they have been theorized to grow efficiently through pebble accretion. A key aspect of pebbles is in the way they interact with the gas flow. Pebbles are marginally coupled to the gas flow which facilitates planetesimals accreting material from the entire gravitational reach of the planet. The headwind experienced by the dust and pebbles particles allows them to lose angular momentum and accrete efficiently. This allows planetesimals to dissipate their kinetic energy and become gravitational bound to nearby large bodies, yielding planetesimal growth rates much higher than previous accretion rates considering planetesimal accretion (Johansen & Lambrechts 2012).
These mechanisms were suggested for gas giants like Uranus and Neptune, and so our model considers how streaming instability and pebble accretion characterize the inner-disk evolution of terrestrial planets. In particular, we consider the effects of streaming instability, pebble accretion, and MRI simultaneously which hasn’t been examined in previous studies. As such, we are looking to test different thresholds to onset streaming instability and see if they yield terrestrial planets consistent with our observations.
In our model, we consider both condensation theory and pebble accretion to investigate how and where planetesimals form as well as the loss of minerals in the inner region. The thermochemical evolution of the inner disk has been discussed in previous studies (Pignatale et al., 2018; Miyazaki and Korenaga, 2021), but previous disk models haven’t considered the major element chemistry in conjunction with astrophysical phenomena like MRI and pebble accretion to track the evolution from dust to planetesimal. Our model records the specific major element composition (Al, Ca, and Mg) over the entire protoplanetary disk evolution ( 1 Myr) from dust to planetesimal while incorporating these astrophysical principles. This allows us to constrain the astrophysical and thermochemical parameters and compare the final elemental compositions of our model to observations in chondrite samples. A key component of this model is implementing values (measure of turbulent viscosity) considering the changes in turbulence within the inner disk. A key mechanism which onsets changes in turbulence is MRI. MRI works to create macroscopic turbulence and transport angular momentum, and the MRI transition has been suggested as a way to create a seed for terrestrial planets (Morbidelli et al., 2020). Therefore, an transition due to the onset of MRI could be a possible mechanism for planetesimal formation.
To understand the conditions that enable streaming instability, we first considered the early stage of disk evolution: transport of gas, dust, and pebbles, before planetesimals are formed. The results showed a sizable dust depletion inside 3 AU where much of the dust was being converted to pebbles but not enough of an enrichment in pebbles to surpass the streaming instability threshold. We implement an transition due to the onset of MRI that directly changes based on the mid-disk temperature so higher temperatures result in stronger turbulence. Our results demonstrate that a significant value change in the terrestrial region creates low pressure regions in which dust accumulates and eventually grows into planetesimals.
Our model solves for radial motion of gas, dust, pebbles and planetesimals within the protoplanetary disk. The motion of gas is described by the diffusion equation:
For solid materials, we solve for the dynamics of the three elements (Al, Ca, and Mg) using the advection-diffusion equation. The first term on the right hand side describes advection, and the second term is dissipation based on the concentration gradient, where denotes the surface density of species i, t is time, r is the distance from the Sun, is the advection velocity, is viscosity, and is the total surface density.
For each element, we consider the evolution from dust to pebble to planetesimal while solving for each stage’s dynamics separately according to Eq. 1. Pebbles are mm-cm sized bodies formed through the collisions of the dust particles. Relative to dust particles, pebbles travel at higher advection velocities and accrete onto growing cores quickly because of their interaction with the gas flow. Pebbles are especially important because of their ability to deliver minerals to the inner regions of the disk as a result of their larger size and drift velocities. The formation of pebbles is described by the dust to pebble formation timescale, , and the Stokes’ number, . Additionally, the evolution from pebble to planetesimal is similarly solved for but characterized by the streaming instability condition, the solid to gas ratio. The streaming instability condition is especially significant because it’s the main mechanism in the formation of planetesimals. In our study, we consider various timescales and pebble grain sizes in order to determine the most likely condition for the onset of streaming instability. We vary pebble to planetesimal timescales and streaming instability conditions in order to investigate how these parameters affect regions of elemental depletion as well as planetesimal formation and composition.
2.1. Disk Model
The model we built tracks the density and composition of solid particles of different sizes (dust, pebble, and planetesimal), as well as the temporal and spatial evolution of gas, temperature, and pressure. Our model solves for the evolution of the disk, spanning from 0.05 to 30 AU with 600 spatial intervals, where the dynamics, thermal structure and composition are self-consistently solved at a time step of 0.1 years. Initial chemical characteristics are defined by the elemental ratios for Al, Ca, and Mg which are .0391, .0372, .9236, respectively. Along with this, we define astrophysical constraints for the planetesimal as having a diameter of 200 km and a material density of 3000 kg/m.
2.2. Thermal Structure
The thermal structure of the disk is solved using energy balance between blackbody radiation and viscous dissipation:
where is the Stefan-Boltzmann constant, is the effective temperature, and is the local Keplerian angular velocity. The mid-disk temperature, , is then calculated using the effective temperature:
where and is opacity of the dust.
For each species i in the gas and dust components, its surface density is solved using a 1D radial disk evolution model. The advection velocity is calculated separately for the gas and dust components. The gas component is calculated using,
and for the dust component,
where is the normalized stopping time, is the gas density at the disk midplane, and is pressure. The pressure at the mid plane is a function of the surface density,
where is the sound velocity. The calculated Stokes Number is given by:
where Q* is the specific energy of fragmentation in which we adopt a value of , and is a measure of the turbulence. We consider both constant Stokes numbers and Stokes number distributions calculated from Eq. 7 as well various distributions in our model.
While cause and magnitude of turbulence in the protoplanetary disk is still an active field of research, in the early stage of a protoplanetary disk, both magneto-rotational instability and pure hydrodynamic turbulence are considered as sources of turbulent viscosity (Bai and Stone 2013; Nelson et al. 2013; Stoll and Kley 2014). MRI is caused by the effective magnetic field from moving electrons in fluids and the resulting Lorentz force from that magnetic field. These forces have a destabilizing effect on the angular velocity of the fluid. Many previous mechanisms for fluid turbulence have been explored including shear flow turbulence, but now the transition of due to MRI has been suggested as a way to create a seed for terrestrial planets. The macroscopic turbulence is characterized by in Eq. 4. Previous studies either consider a constant value or a grid of values including , , . We simulate the change in due to MRI with a temperature dependent distribution in which directly changes based on the mid-disk temperature. Therefore, higher temperatures result in stronger turbulence, especially in the inner disk. This allows the turbulence to evolve as temperature changes over time and better reflect the turbulent viscosity of the inner disk.
2.5. Streaming Instability & Pebble Accretion
For each element, the growth from pebble to planetesimal is modelled using pebble accretion. The mass accretion rate is defined as:
where is the accretion radius, and and are the pebble column density and mid-plane density, respectively. The accretion rate of pebbles is closely tied to the friction time of pebbles which is described by:
Since the focus of our model is the early stage of growth, we adopt the follow equation for the accretion radius () using the Bondi Limit:
(Johansen & Lambrecths 2017). For simplicity, we assume that planetesimals will be 200 km in diameter and have a material density of 3000 kg/m. The formation of planetesimals is triggered by the streaming instability condition. Pebbles are marginally coupled to the gas so radial drift allows pebbles collide and grow larger. As a result, they can gravitationally collapse onto the core if the dust to gas or pebble to gas ratio becomes too heavy. The theoretical value for the streaming instability condition has been suggested to be characterized by a dust to gas ratio > 1. In our model, we consider values of , , as plausible values for the streaming instability condition. Note that we label the the change in from to constant and to constant , MRI 4 and MRI 3 respectively, and we will use these naming conventions throughout the figures.
Our model tracks the evolution of the major element chemistry through dust to planetesimal. Throughout the evolution of the disk, the temperature profile of the disk cools as the nebular gas cools and dissipates over time. Gas and dust from the outer disk through advection and diffusion travel towards the inner disk in which dust and pebble particles can accumulate and grow over time (Figure 2). This model builds upon previous models but incorporates key ideas like non-uniform turbulence due to MRI and pebble accretion. The implementation of turbulence due to MRI points to new considerations for the formation of planetesimals in the terrestrial region. The turbulence within the disk has a significant impact on the dust accumulation and the streaming instability condition.
The dust accumulation can be attributed to the pressure within the inner disk (Figure 3). For larger differences in , there exist more low pressure regions within the disk that dust can travel into. These low pressure regions enable dust to form into pebbles and eventually planetesimals as dust flows from high pressure regions into these low pressure pockets. Additionally, the planetesimals formed must fit within reasonable constraints of terrestrial planets, particularly its mass. Terrestrial planets within the first 0.1 Myr should be on the order of kg and especially when considering the bulk composition evolution of Earth, the mass should be close to Earths. Two different -values are tested for the outer region ( and ), and its effect on the formation of planetesimals is illustrated in Figure 4.
Another parameter which impacts the growth of the planetesimals is the Stokes number. In this figure, we consider of 5, 10, and 20 times the . Terrestrial planets exist within a specific mass range, typically < kg. Comparing the Stokes number further helps us constrain the viable parameter space. Looking at Figure 5, there is a positive correlation between the Stokes Number and the planetesimals mass. In the first 100,000 years of evolution, planetesimals with masses ~ kg would not explain the evolution of Solar System as terrestrial planets have masses on the order of to kg.
Finally, we can constrain the elemental compositions and thermal structure of the inner disk. Chondrites are likely the building blocks of Earth but not one specific type of chondrite matches the Earth in terms of chemical and isotopic compositions. Earth is suggested to form from a Mg-rich reservoir, which existed just adjacent to the source region of enstatite chondrites, depleted in Al, Ca, and Mg. In order for the fractionation between these elements to occur, the temperature profile inside 1AU needs to be > 1000K which places further constraints on our parameter space. Incorporating a timescale, < 10,000 years would make the inner disk too cold which would increase the amount of ice existing around the terrestrial region. For a timescale of 100,000 years, more dust remains to support a high temperature inside 1AU. However, we don’t see enough of a depletion in Al and Ca to support having two chemically distinct Mg rich reservoirs (Figure 6).
By solving advection-diffusion equations in a self-consistent manner, our model considers pebble accretion and magnetorotational instability to simulate conditions in which streaming instability can occur. Our model suggests that creating planetesimals requires a significant change in turbulence which is measured by the dimensional parameter in order to create pressure pockets for dust and pebble accumulation. In our model, a transition from turbulence due to magnetorotational instability to a constant value of , MRI 4, is sufficient in creating planetesimals in the terrestrial region. Additionally, the total mass of the planetesimals heavily depends on the Stokes Number in which a pebble stokes number equal to 5 times the dust’s Stokes Numbers yielded feasible masses for possible terrestrial planets. Additionally, the timescale affects the streaming instability condition and temperature profile. For years and years, the model allows for a higher SI condition to be onset but also depletes dust too quickly which in turn results in heat dissipating too quickly as well. While the transition of from MRI to may trigger the streaming instability in the terrestrial region, the resulting compositions do not seem to explain the Earth nor enstatite chondrites in terms of the elemental compositions as we do not see enough depletion in Al and Ca. However, in a broader sense, our model works to connect two previously distinct areas of research, the minute chemical interactions of gas and dust within the early protoplanterary disk and N-body dynamics which are simulations once planets have already formed. While our model may not currently explain the formation of Earth, it’s possible that the parameters we considered could explain the composition of other exoplanet systems and that our model show key advancements in the evolution of the protoplanterary disk as well as in bridging the gap between cosmochemical observations and astrophysical dynamics.
5. Future Work
The next step for this model is to expand the viable parameter space. The parameters that were evaluated in this model help constrain viable astrophysical parameters but certain aspects like the depletion in Al and Ca are still not fully explained from our model. In addition, more work can be done to push the bounds of the streaming instability condition closer to the theoretical threshold. Inversely, finding the minimum thresholds to onset SI is an active area of research, so developments on these two fronts will help improve our model. On a grander scale, unifying our model of the early stage of evolution with the cosmochemical observations and later stages of planetary evolution like N-body simulations will give us a more complete depiction of the formation of terrestrial planets and the solar system.
I would like to thank Yoshinori Miyazaki for his continued mentorship throughout our research as well as the Caltech SURF program for supporting this project.
X.-N. Bai and Stone J. M. Wind-driven accretion in protoplanetary disks. i. suppression of the magnetorotational instability and launching of the magnetocentrifugal wind. The Astrophysical Journal, 769(1):76, 2013. doi: 10.1088/0004-637x/769/1/76.
S.J. Desch, Estrada P.R., A. Kalyaan, and J.N. Cuzzi. Formulas for radial transport in protoplanetary disks. The Astrophysical Journal, 840(2):86, 2017. doi: 10.3847/1538- 4357/aa6bfb.
J. Drążkowska and Y Alibert. Planetesimal formation starts at the snow line. Astronomy & Astrophysics, 608, 2017. doi: 10.1051/0004-6361/201731491.
Y. Miyazaki and J. Korenaga. Effects of chemistry on vertical dust motion in early protoplanetary disks. The Astrophysical Journal, 849(1):41, 2017. doi: 10.3847/1538- 4357/aa8cd1.
Y. Miyazaki and J. Korenaga. Dynamic evolution of major element chemistry in protoplanetary disks and its implications for earth-enstatite chondrite connection. Icarus, 361:114368, 2021. doi: 10.1016/j.icarus.2021.114368.
A. Morbidelli, G. Libourel, Jacobson Palme, H., S. A., and D. C Rubie. Subsolar al/si and mg/si ratios of non-carbonaceous chondrites reveal planetesimal formation during early condensation in the protoplanetary disk. Earth and Planetary Science Letters, 538:116220, 2020. doi: 10.1016/j.epsl.2020.116220.
F. C. Pignatale, S. Charnoz, M. Chaussidon, and E. Jacquet. Making the planetary material diversity during the early assembling of the solar system. The Astrophysical Journal, 867(2), 2018. doi: 10.3847/2041-8213/aaeb22.
M. H. Stoll and W Kley. Vertical shear instability in accretion disc models with radiation transport. Astronomy & Astrophysics, 572, 2014. doi: 10.1051/0004-6361/201424114.
J.T. Wasson and G.W. Kallemeyn. Compositions of chondrites. Series A, Mathematical and Physical Sciences, 325(1587):535–544, 1988. doi: 10.1098/rsta.1988.0066.
A. Youdin and A Johansen. Protoplanetary disk turbulence driven by the streaming instability: Linear evolution and numerical methods. The Astrophysical Journal, 662 (1):613–626, 2007. doi: 10.1086/516729.
Currently, researchers believe that there are only two regimes of galaxies, delineated by whether they produce stars or not. However, using a novel machine learning technique to estimate the temperatures of galaxies, we find evidence of a distinct third regime of galaxy evolution. To demonstrate this, we studied roughly 650,000 galaxies in the publicly available COSMOS dataset. Moreover, we identify a group of galaxies that belong to this third regime, providing clear visual evidence of our three-regime model of galaxy evolution.
Author: Patrick Rim California Institute of Technology Mentors: Charles L. Steinhardt, Adam Blank Cosmic Dawn Center, Niels Bohr Institute, California Institute of Technology Editor: Laura Lewis
Dimensionality reduction is an unsupervised machine learning technique used to discover and visualize patterns in a dataset by reducing its number of dimensions. We show that by applying dimensionality reduction to a dataset of galaxies, we can provide evidence for an updated model of galaxy evolution. The currently accepted model of galaxy evolution defines two regimes of galaxies: star-forming and quiescent. However, using a novel technique called IMF fitting to estimate the temperatures of galaxies, we find that the regime of star-forming galaxies can be subdivided into two distinct regimes. To demonstrate this, we studied roughly 650,000 galaxies in the COSMOS dataset, each of which has flux measurements in twenty-seven photometric bands. First, we preprocessed the data by removing galaxies with incomplete or erroneous measurements and normalized the photometric measurements to a single photometric band. We then applied the t-SNE dimensionality reduction algorithm to the preprocessed dataset, determining appropriate values for t-SNE’s parameters. Observing the resulting plots, we found that t-SNE clustered the galaxies together by regime and that the three clusters representing the three regimes were ordered sequentially on the map. Thus, our study provides evidence of three distinct regimes of galaxy evolution.
The universe is filled with stars, which were all formed at some point in time. Most existing stars were formed in the early stages of the universe, with star formation rates peaking about ten billion years ago . In comparison, galaxies today are not forming stars at nearly the same rate. In fact, star formation rates have dropped to about three percent of that peak . Following such observations, the currently accepted model of galaxy evolution defines two regimes: star-forming and quiescent .
We have recently discovered a way to estimate the temperatures of galaxies using a technique called Initial Mass Function (IMF) fitting, described by Sneppen et al. . Studying both the star formation rates and the IMF temperatures of galaxies, we find a small group of star-forming galaxies that have higher temperatures than other star-forming galaxies. Therefore, two distinct groups of star-forming galaxies are observed. This observation challenges our currently accepted model of galaxy evolution, which details just two regimes of galaxies: star-forming and quiescent. Thus, accounting for the newly-discovered group of star-forming galaxies, we propose an updated three-regime model of galaxy evolution.
In this paper, a machine learning technique called “dimensionality reduction” is used to provide experimental evidence of this new third regime of galaxies. In particular, we apply a dimensionality reduction algorithm called “t-SNE” , or “t-distributed stochastic neighbor embedding,” to the Cosmic Evolution Survey (COSMOS) dataset of galaxies which contains about one million galaxies and twenty-seven photometric bands for each galaxy . Using dimensionality reduction, we are able to visualize this high-dimensional dataset and correct for numerous issues that arise when analyzing data points in a high-dimensional space. The most important of these issues is that data points are too far apart to be grouped into clusters. . We describe and justify the methods we use to preprocess the COSMOS dataset, such as feature scaling and removing outliers. By preprocessing the dataset, we are able to produce meaningful plots that correctly cluster together physically similar galaxies. We show that the maps produced by t-SNE when applied to COSMOS provide clear visual evidence of our three-regime model of galaxy evolution.
This paper is structured as follows: Section 2 details the currently accepted two-regime model of galaxy evolution and our proposed three-regime model of galaxy evolution. Section 3 describes the COSMOS dataset of galaxies. In Section 4, we introduce and motivate the use of dimensionality reduction, as well as the t-SNE algorithm. In Section 5, we present the maps produced by t-SNE when applied to COSMOS. In Section 6, we summarize and discuss our results.
2. Models of Galaxy Evolution
2.1 Two-Regime Model
When we observe the galaxies in the Hubble Deep Field, a small region of space surveyed extensively by the Hubble Space Telescope, we see galaxies of various sizes, shapes, and ages . However, in the last fifteen years, a puzzling fact about “star-forming galaxies” has been discovered: star formation rates seem to be consistent across galaxies with varying features and properties. These galaxies comprise the “Star-Forming Main Sequence” , in which all star-forming galaxies seem to form stars at the same rate. We also observe “quiescent galaxies,” which are galaxies that produce stars at a low rate. The currently accepted model of galaxy evolution classifies galaxies as either star-forming or quiescent . There are two main reasons that astronomers accept this two-regime model.
First, multiple previous studies have shown that star-forming galaxies and quiescent galaxies can be clearly distinguished on a U-V vs. V-J (UVJ) diagram . The U-V color band contains wavelengths from ultraviolet to visible light while the V-J color band contains wavelengths from visible to near-infrared light. Plotting galaxies of similar redshift and mass on a UVJ diagram that compares each galaxy’s fluxes in the U-V color band and the V-J color band, we see that star-forming galaxies and quiescent galaxies are clustered separately on either side of a distinct boundary (Figure 1).
Second, in today’s night sky, we observe red and blue galaxies, where quiescent galaxies appear red because they are dominated by red, low-mass stars, and star-forming galaxies appear blue because they are dominated by blue, high-mass stars . We do not observe any galaxies that appear neither red nor blue. In this way, all star-forming galaxies appear to be identical—until we account for their IMF temperatures.
2.2 Three-Regime Model
Only the two regimes of “quiescent” and “star-forming” are apparent when studying the sSFRs, or specific star formation rates, of galaxies. This changes when we take into account the IMF temperatures of these galaxies, which are estimated with the IMF fitting method discovered by Sneppen et al. .
When we plot galaxies by log(sSFR) and IMF temperature, we see that galaxies with high values of sSFR, which are star-forming galaxies, span a wide range of temperatures. For example, in the hexbin plot of galaxies with a redshift value between 0.7 and 0.8, labeled a) in Figure 2, we see that there are galaxies that form a “horizontal branch” that have a different relationship between star formation rate and temperature than most of the other star-forming galaxies. While most star-forming galaxies have a similar temperature, the galaxies in this branch have a significantly higher temperature. This means that not all star-forming galaxies are physically similar, which contradicts the currently accepted theory.
Figure 2 shows six hexbin plots, each contain galaxies in redshift ranges of size 0.2, spanning from 0.5 to 1.7. Galaxies are binned by redshift because galaxies in different redshift ranges will appear different even if they are physically similar. This is because the wavelengths at which their fluxes are measured are significantly redshifted. The hexbin plot containing galaxies with redshift values between 0.5 and 0.7 is labeled, and all other plots should be interpreted identically.
We see that there are two distinct regimes of star-forming galaxies, which are labeled as Region 1 and Region 3. We label the galaxies in Region 1 (the “horizontal branch”) as the Pre-Main Sequence galaxies, because we believe that these galaxies are hot star-forming galaxies that are cooling into the Star-Forming Main Sequence, which we have labeled as Region 3. We have allocated Region 2 as a transition phase between the Pre-Main Sequence galaxies and the Star-Forming Main Sequence galaxies. Region 5 contains the quiescent galaxies and Region 4, similarly to Region 2, is a transition phase between the Star-Forming Main Sequence galaxies and the quiescent galaxies.
3. COSMOS Dataset
While observing the spectra of galaxies is likely the most accurate method of studying them, observing a high-resolution spectrum of a galaxy can be expensive and time-consuming. Thus, many astronomers use photometry to study galaxies, which measures the total amount of light emitted by a galaxy in certain “bands” of wavelengths . While observing a high-resolution spectrum of a galaxy can be a long and expensive process, it is much quicker and cheaper to measure the amount of total electromagnetic radiation that the galaxy emits in wider “bands”. It is generally accepted that photometry is an accurate method of studying galaxies because it is unlikely that galaxies emit light in completely arbitrary combinations of wavelengths . Thus, we can model a galaxy as if it is a nearby galaxy if we are able to observe that it emits light in certain “bands”. Previous studies of galaxies using photometry have been able to accurately deduce their properties, so we have evidence that photometry is a valid and efficient method of studying galaxies .
The COSMOS dataset of galaxies contains the “band” measurements required to conduct a photometric study. It catalogs about one million galaxies and twenty-seven photometric bands for each galaxy, making it one of the largest multi-wavelength catalogs of galaxies . We see that COSMOS is a twenty-seven-dimensional dataset, where each photometric band is an “attribute” of the COSMOS dataset.
For our study, the flux measurements of the galaxies were retrieved from COSMOS2020, the most recent version of COSMOS published in 2021, while other physical properties of the galaxies were retrieved from COSMOS2015, the version of COSMOS published in 2016. The COSMOS2020 and COSMOS2015 datasets contain the same galaxies, each identified with a common unique identification number.
4. Dimensionality Reduction
Dimensionality reduction, an unsupervised machine learning technique, is the process of reducing the number of dimensions, or “attributes,” of a dataset . Rather than keeping a select number of dimensions and discarding the excess ones, dimensionality reduction algorithms create entirely new dimensions that preserve information from each of the original dimensions. Dimensionality reduction algorithms have successfully been used in previous studies to classify galaxies . There are two main reasons that dimensionality reduction is a useful tool for analyzing high-dimensional datasets.
The first reason can best be summarized as the “curse of dimensionality.” In a high-dimensional space, data points are very sparse, meaning that most data points are so sparse that they seem dissimilar . Sparsity is measured using the Euclidean distance metric, which is the standard distance metric used to compute distances between points in a high-dimensional space. Because most data points are extremely sparse, clustering algorithms produce statistically insignificant results when applied to a high-dimensional dataset . Furthermore, when distances are computed in a high-dimensional space, noise may be overemphasized since the noise may be present in many different dimensions . High-dimensional spaces also have properties that are unintuitive to human analysts, since we are more familiar with two and three-dimensional spaces . By applying dimensionality reduction to a high-dimensional dataset, we are able to overcome these issues by working with the data points in just two dimensions.
Another reason why we use dimensionality reduction is that it is hard, if not impossible, to visualize a high-dimensional dataset. In a one-dimensional graph, data points are plotted along a single line, making it very easy to visualize and compare them. In a two-dimensional graph, which is a very commonly used data visualization tool, each data point is ordered from left to right in ascending order in one attribute, and bottom to top in ascending order in the other attribute. As such, two-dimensional graphs are also very easy to visualize. A three-dimensional graph is harder to visualize than a one-dimensional or two-dimensional graph, although it is still feasible by creating an illusion of depth of perception. We may even try to use other visible attributes such as the colors or the sizes of the data points to represent values in four or five dimensions on a graph. However, there are not enough creative ways we can visualize a very high number of dimensions on a two-dimensional surface. There is clearly no feasible way to visualize a dataset like COSMOS which contains twenty-seven dimensions. Applying dimensionality reduction to these high-dimensional datasets allows us to visualize the data points on a standard two-dimensional graph.
4.1 t-SNE Algorithm
t-SNE, or t-distributed stochastic neighbor embedding, is one instance of a dimensionality reduction algorithm . The t-SNE algorithm takes as inputs the high-dimensional dataset and the intended dimensionality of the output space, and it embeds the data points into a lower-dimensional space. In the process, t-SNE preserves the nearest neighbors of each data point but does not necessarily preserve exact distances and point densities. For most applications of t-SNE, a two-dimensional output space is selected for visualization purposes. The reduced dimensions created by t-SNE, or any other dimensionality reduction algorithm, are unitless and the numerical values of data points in these dimensions are insignificant.
t-SNE also has an input parameter called “perplexity” . Perplexity can be thought of as the number of nearest neighbors that t-SNE factors in for each point when embedding the data points in a lower-dimensional output space. Choosing an optimal value of perplexity is accomplished through a heuristic approach. In particular, to choose a value for perplexity for a given iteration of t-SNE, a guess is made for the average number of relevant neighbors that each point in the dataset has. The value of perplexity can then be increased or decreased based on a qualitative observation of the map of the output space. Generally speaking, a lower value of perplexity means that t-SNE will preserve the local relationships between the data points, while a higher value of perplexity means that t-SNE will preserve the overall global structure of the data .
In order to justify why the t-SNE algorithm was used for this project, we first compare t-SNE to another commonly used dimensionality reduction algorithm: principal component analysis, or PCA. While PCA is a linear algorithm, t-SNE is a non-linear algorithm [10, 18]. PCA reduces the dimensions of a dataset into n dimensions by creating a basis of n orthonormal vectors of the original values and transforming each of the values into a linear combination of the n basis vectors. On the other hand, t-SNE uses a cost function called KL divergence to create a clustering of the data points in n dimensions by minimizing the cost between all of the points [26, 19].
KL divergence, or Kullback-Leibler divergence, calculates the relative entropy between two probability distributions, where relative entropy is the measure of how a probability distribution differs from another probability distribution [26, 19]. For two continuous probability distributions P and Q defined over an interval [a, b], the relative entropy between P and Q, denoted by , is defined as:
where p and q are the probability density functions of P and Q respectively.
4.2 Applying t-SNE to MNIST
We briefly study the application of t-SNE to the popular MNIST dataset to illustrate how t-SNE works. MNIST is a dataset of 42,000 images of digits from 0 to 9, where each image is 28 pixels across and 28 pixels down . This means that each image contains total pixels. Each pixel can be considered an attribute, or “dimension”, of the dataset, which makes MNIST a 784-dimensional dataset. For the previously detailed reasons, we apply the t-SNE dimensionality reduction algorithm to cluster and map the images in a two-dimensional output space.
Non-linear dimensionality reduction algorithms like t-SNE can cluster the images in MNIST more effectively than linear dimensionality reduction algorithms such as PCA. This is because an image of a digit composed of a grid of pixels cannot be logically represented by PCA as a linear combination of basis vectors, unlike other objects such as words or numbers. Thus, PCA is unable to accurately cluster images of the same digit together, while t-SNE is able to do so.
As Figure 3 shows, for the MNIST dataset, focusing on the local structure of the clusters by choosing a low perplexity value yields the most accurate and well-defined clusters. This is an unexpected result because MNIST is quite a large dataset, and focusing on global structure tends to work more effectively for large datasets. However, we can reason that a low perplexity value worked well for MNIST because each image only has a few other images that look similar to it.
As seen in Figure 4, t-SNE has accurately clustered together images of the same digit in a two-dimensional space. The way in which t-SNE arranges the distinct clusters of digits is also significant. Consider the green cluster of images of the digit ‘3’ in the middle of the map. An image of a ‘3’ on the left side of the cluster, nearer to the orange cluster of images of the digit ‘1’, is more vertical—resembling a ‘1’. On the other hand, an image of a ‘3’ on the right side of the cluster, nearer to the purple cluster of images of the digit ‘8’, is more round and more closely resembles an ‘8’. Even across clusters, t-SNE places data points near other similar points, which is useful in other real-world datasets such as COSMOS that lack well-defined clusters.
5. Applying t-SNE to COSMOS
For the purposes of our study, we worked with a subset of the COSMOS dataset containing only the galaxies that have an IMF temperature assigned by the IMF fitting method. This was done because we need the IMF temperature of a galaxy to label it with one of the regimes of our three-regime model of galaxy evolution. This subset of the COSMOS dataset that we used in our study contains 97,217 distinct galaxies, which is about 9.72% of the entire COSMOS dataset.
Preprocessing a dataset before applying an algorithm such as t-SNE facilitates more accurate and efficient data analysis . In our analysis of COSMOS, we applied two preprocessing techniques: removing erroneous data and feature scaling (normalization).
The COSMOS dataset contains erroneous data in the form of nonmeasurements and nondetections, where some galaxies in the dataset have erroneous or no measurements in some or all of the photometric bands. In COSMOS, the value -99 denotes a nonmeasurement or nondetection for a galaxy in a certain photometric band. By removing the roughly 8,000 galaxies in our subset with flux values of -99 in any photometric band, we were able to create a more accurate output map.
Normalizing the COSMOS dataset was a critically important step because the galaxies are at different distances, which means flux measurements alone cannot be used to compare the luminosities of galaxies at different distances. Consider this example: Galaxy X has flux measurements of 1 in Photometric Band A and 1 in Photometric Band B, Galaxy Y has flux measurements of 1 in Band A and 2 in Band B, and Galaxy Z has flux measurements of 4 in Band A and 4 in Band B. However, Galaxy X and Y are at the same distance away while Galaxy Z is at half the distance of the other two galaxies. According to the inverse square law, Galaxy X and Galaxy Z actually have the same luminosity in both bands. However, our dimensionality reduction algorithms would consider Galaxy X and Galaxy Y to be more similar than Galaxy X and Galaxy Z because the Euclidean distance between them in these two measurements is smaller (1 between Galaxy X and Galaxy Y; between Galaxy X and Galaxy Z). Thus, to fix this bias, we normalize all of the measurements to one band. That is, all of the flux measurements are scaled such that every galaxy has the same flux value in one of the bands, and the measurements in the other bands are scaled accordingly. In this example, Galaxy X would be scaled to have flux measurements of 1 in Band A and 1 in Band B, Galaxy Y would be scaled to have flux measurements of 1 in Band A and 2 in Band B, and Galaxy Z would be scaled to have flux measurements of 1 in Band A and 1 in Band B.
After then applying t-SNE to the preprocessed COSMOS dataset, we found that the primary attribute that influenced the clustering of points was redshift. This was because two similar objects would look completely different at different redshifts because the flux measurements at one set of wavelengths for one galaxy would all be redshifted to a new set of wavelengths for the other galaxy. Due to this, we created ranges of redshifts and created t-SNE plots of galaxies in each of these bands. The redshift ranges range from 0.5 to 1.5 and are of size 0.1, meaning that the first redshift range contains galaxies from redshift 0.5 to 0.6, the second redshift range contains galaxies from redshift 0.6 to 0.7, and so on. Each galaxy in one of these ranges has values of 1+z, which is 1 added to the redshift value, within 10% of each other.
6. Results of t-SNE on COSMOS
After experimenting with different values of perplexity, we used a perplexity value of 40 to generate our final two-dimensional t-SNE maps. This is a moderately high value of perplexity.
Figure 5 shows ten two-dimensional t-SNE maps, with each containing the galaxies in COSMOS in the labeled redshift range of size 0.1. In each of these maps, it is apparent that there is a gradual transition from galaxies in Region 1, which are the Pre-Main Sequence galaxies, to galaxies in Region 3, which are the Main Sequence Galaxies, and then to galaxies in Region 5, which are the quiescent galaxies. Galaxies in Regions 2 and 4, which were the transitions, also seem to be properly located on the maps so that the galaxies are arranged roughly sequentially from Region 1 up to Region 5.
We see that in each of the ten t-SNE maps, there appears to be a small cluster of galaxies that are separated from the main, large cluster of galaxies. By coloring the t-SNE maps using various physical attributes of the galaxies, we have identified the galaxies in these small clusters to be galaxies with high amounts of dust emission. Even within these small clusters, we see that galaxies are arranged roughly sequentially by region.
To more clearly illustrate this sequential ordering, we also plotted hexbin plots for the same ten redshift ranges. Each bin is colored by the modal regime, which is the regime that is most frequently represented in that bin. In other words, the mode was the function used to reduce all of the values in a bin to a single value. The ten hexbin plots are displayed in Figure 6, which clearly display the sequential ordering of the regimes.
7. Conclusion and Outlook
The two-dimensional t-SNE maps and the hexbin plots cluster together galaxies of the same regime, meaning that pre-main sequence galaxies are most similar to other pre-main sequence galaxies, main sequence galaxies are most similar to other main sequence galaxies, and quiescent galaxies are most similar to other quiescent galaxies. We know this based on our observation that t-SNE tends to cluster together data points that are similar to each other.
We observe that the clusters of regimes are arranged in sequential order, from pre-main sequence galaxies, to the main sequence galaxies, to the quiescent galaxies, with the two clusters of “transition” galaxies also being placed correctly between the three main clusters. As demonstrated with the application of t-SNE to MNIST, we know that t-SNE places data points near other similar points even across clusters. Using this knowledge, we see that t-SNE is indicating that galaxies in one region of our maps are more similar to galaxies in directly preceding or succeeding regions than to galaxies in other regions.
We conclude that the t-SNE maps illustrate and provide evidence for our three-regime model of galaxy evolution, where a hot, star-forming galaxy cools into the Star-Forming Main Sequence, which then transitions into a quiescent galaxy when its blue, high-mass stars die out and star formation ceases.
It is clear that dimensionality reduction is a useful tool that can be used to uncover realities in datasets that are not discernible through mere observation. Specifically, in the field of astronomy, applying t-SNE or other dimensionality reduction algorithms to datasets of other celestial objects such as stars and galaxy clusters may allow us to discover new groups and classifications.
First, I would like to thank my mentor, Professor Charles L. Steinhardt, for his guidance throughout this research project. I would also like to thank Thomas H. Clark, Andrei C. Diaconu, and the faculty at the Cosmic Dawn Center for their support. This project would not have been possible without the generous funding contributed by Mr. and Mrs. Hassenzahl.
Armstrong D. P., Mccarthy M. A., 2007, Avian Conservation and Ecology, 2
Affordable housing is becoming a rapidly growing problem in all the major residential hubs worldwide. This problem has three major contributing factors: high population growth, limited supply of raw materials, and high transportation costs. This has led to housing becoming a luxury rather than a basic human need. Our study aims to analyse structural columns designed using recycled metals and readily available granular materials like sand. This will reduce the overall material cost and the cost associated with transporting bulky materials like steel and aggregate.
In order to understand the range of loads that these columns have to bear, we model the most basic of residential structures. As per the theory given in literature for solid mechanics and failure of soil, we derive the uniaxial strength equation for the proposed column designs. Now using the bearing load values, we try to validate whether these columns are practically feasible for construction purposes. Our study shows that if we use recycled metals or 3D printed materials like Poly Lactic Acid (PLA) for the confining annulus, these columns can indeed be used for civil construction. Finally, we focus on the practical considerations and geometric proportions of the columns for practical use on field.
Author: Utkarsh Gupta Indian Institute of Technology Delhi Mentors: Jose E. Andrade, Siavash Monfared California Institute of Technology Editor: Ann Zhu
Concrete and steel have been used for civil construction since time immemorial. Al- though they are great for building structures from a strength-based perspective, they are bulky to transport, expensive, not readily accessible in remote locations, and not very sustainable.
A lot of ongoing research focuses on exploring materials that are environment friendly and cheap to replace concrete and steel in Reinforced Concrete (RCC) structures. Hector Archila et al.  explored the idea using bamboo as a replacement for reinforcing steel in RCC structures. But certain studies show that it is an ill-considered concept and not a sustainable alternative for steel  due to various strength, durability and economic reasons. P. Lalzarliana et al.  explored the idea of using waste plastic bottles filled with Recycled Concrete Aggregates, aggregates processed from construction and demolition waste, to replace bricks in masonry structures. These solutions may be sustainable, but do not eliminate the biggest challenge in the field of civil construction, which is transportation of bulky raw material.
With more and more innovation in the field of 3D printing and material science, it has now become possible to manufacture and realise complex, high-strength designs on the site of construction. This greatly reduces the need for any transportation of bulky materials from one place to another. In this study, we analyse columns made using granular material in-filled inside a confining annulus. For the in-fill, we assume a granular material like sand, which is not only cheap, but also readily available in remote locations. For the confining material, we propose to use recycled metals such as steel or aluminium, or a suitable 3D printed material. Given these properties, the composite columns should be considerably more cheaper and environmentally friendly than the current RCC columns used in concrete structures.
Using three separate theories: Jaky’s Theory , Theory of Solid Mechanics, and Bell’s Addition to Rankine’s Theory of Lateral Earth Pressure [5, 6], we find the ratio of applied vertical stress on the granular material to the lateral stress on the confining annulus. The yield strength of composite column, fycc is then derived using all three theories in the subsequent sections. A closer look at the results from these theories shows that there exists a correlation between them. This is where we introduce the dimensionless variable, Soil Characterisation Constant, S◦. After this, we move towards understanding the physical significance of each of the terms in the derived equation. Towards the end, we use the Buckingham π Theorem  to derive the π constants for the derived yield strength equation. In order to validate the columns for use in practical scenarios, we also consider three to four story residential structures and calculate the loads on their columns. Then, we check whether our designed columns can sustain those loads while staying inside the limits of reasonable geometric proportions.
2. Setup and Assumptions for Analysis
In order to analyse the composite columns, we assume a suitable design for the composite column that would help us carry out the numerical analysis. Figure 1 depicts a rough diagram of the same.
Here, σapp is the uni-axial force that will be applied on the top cap and directly be transferred to the granular in-fill below. ’a’ and ’b’ are the inner and outer radius of the annulus respectively. h◦ is the height of the granular in-fill before any loading is applied on the setup.
In order to simplify the analysis procedure, we make certain assumptions based on the available literature on uni-axial compression of granular materials.
During the loading range under consideration, we assume that the granular media behaves in a Linear Elastic manner .
Under the considered loading range, we assume that there is no considerable crushing of the particles of the granular media .
We ignore any friction between the particles of the granular media and the inner wall of the confining annulus (μ ≈ 0) .
We ignore any hoop strains or radial strains that may be generated in the con- fining annulus (εθ ≈ 0) .
Material properties for the granular media and confining material have been assumed as per the stress-strain plot shown in Figure 2.
3. Vertical and Angular Stress in the Confining Annulus
The ultimate computational goal of this study is to derive an equation for the yield strength of the composite column. For this, we consider the boundary stress conditions where it is at its maximum, i.e the inner wall at the bottom of the container.
We know that mass of the in-fill granular material will remain constant throughout the compression cycle. We use this to find out the final density of the granular material as a function of the distance the top cap moves down into the annulus. Let that distance be ∆, as shown in Figure 3.
Therefore we can say that:
The vertical stress, σv, at a distance x below the top line can written as,
Where σapp is the uni-axial vertical stress that is applied on the top cap of the composite column. Substituting Equation (1) in (2),
According to the above equation, σv is directly proportional to x, i.e. the depth at which we are calculating the vertical stress. For maximum vertical stress, x = h◦. Therefore,
Equation (4) gives the maximum vertical stress at the bottom of the container as a function of the applied vertical stress on the top cap of the composite column. Using stress analysis, we can find out the angular stress in the wall of the annulus. Note that this stress will be tensile in nature.
Figure 4 shows the pressure distribution on the annulus wall. Using these notations, angular stress, σθ can we written as,
From Equation (5), we observe that σθ is inversely proportional to the radial distance from the center of the annulus. Therefore, the maximum angular stress will be on the inner wall of the annulus, i.e. at r = a.
Using Equations (4) and (6), we derive the yield strength equation for the composite column as per the various lateral earth pressure theories mentioned in Section 1.
4. Yield Strength of Composite Column using Jaky’s Theorem, Theory of Solid Mechanics and Rankine’s Theory of Lateral Earth Pressure
We have already computed the vertical stress at the bottom of the annulus using the theory of mechanics. But in order to translate this vertical stress into lateral stress, we cannot employ these simple theories as they do not apply to granular materials like sand. Therefore, we use different lateral earth pressure theories given in the literature (Jaky’s Theorem, Rankine’s Earth Pressure Theory). To explore all avenues, we have also formulated the yield strength equation using the theory of solid mechanics. The theory and derivation using each of these theories has been discussed in the subsequent sections.
4.1 Jaky’s Theorem
Jaky’s Theorem states that the ratio of the lateral pressure to vertical pressure, K◦ for cohesionless soils in at-rest condition can be given by the following equation.
Using this we can say that
4.2 Theory of Solids Mechanics
Suppose we assume that the in-fill granular material is linearly elastic, homogeneous, and isotropic in nature. Then we can say that,
Egm = Young’s Modulus of the In-Fill Granular Material, νgm = Poisson’s Ratio of the In-Fill Granular Material
As per the assumptions listed in Section 2, we can say that εxx ≈ 0 and εyy ≈ 0. Due to radial symmetry in loading, we can say that σxx = σyy = σh. Now using this in either of Equation (14) or (15), we can write
4.3 Rankine’s Theory of Lateral Earth Pressure
Rankine developed the theory of active and passive earth pressure in 1857 for cohesionless soils . Using this, he gave the coefficients of active and passive earth pressure.
Ka = Coefficient of active earth pressure, Kp = Coefficient of passive earth pressure, φ’ = Effective angle of friction of the granular material
For soils with cohesion, Bell developed an analytical solution using the square roots of the pressure coefficient developed by Rankine, to predict the contribution of the cohesion part of the soil to the overall lateral pressure.
σv = Vertical stress, σa = Lateral stress in active case, σp = Lateral stress in passive case, c ‘= Effective cohesion of granular material
As per our assumed setup, we can say that the deformation in the annulus will be approximately as shown in Figure 5.
We can clearly observe that our case is only concerned with the active earth pressure coefficient. So we can write
4.4 Derivation for Yield Strength of Composite Column
We have already derived the maximum vertical stress at the bottom of the composite column in Equation (4). Using Equations (8), (12), (17), we translate that vertical stress into lateral stress on the confining annulus. Then using Equation (6), we can compute the angular stress on the inner wall of the confining annulus. Now in order for the column to not fail, σθ,max ≤ fycm. This equation will have the σapp term. At the critical case, σapp can be written as σapp,max, and this σapp,max = fycc. Using this computation procedure, we find out that the yield strength of the composite column can be given as,
Although we have used three separate theories to formulate the yield strength equations, on close observation, it can be seen that these equations are not that different from each other. The fundamental part of the equation remains same in all three. In the subsequent sections, we combine these three into a single equation by introducing the Soil Characterisation Constant, S◦.
5. Towards a Combined Yield Strength Equation
5.1 Combined Yield Strength Equation Expression
Observing Equation (18) carefully, we see that in each of these equations, there is dimensionless term that characterises the in-fill granular material. Let us call that constant as the Soil Characterisation Constant, S◦. Using this constant, we can combine the three equations as,
5.2 Understanding the Combined Yield Strength Equation
If we take a closer look at Equation (19), we observe that the equation can be split into four separate parts.
Each component of the equation adds or subtracts to the yield strength of the composite column
(1) Part 1: This is the strength that the column gains from the material properties of the confining material (fycm: Yield Strength of the Confining Material) and the geometry of the annulus (a: Inner Radius, b: Outer Radius), and the Soil Characterisation Constant, S◦.
(2) Part 2: This is the strength that the column gains because of the outer confining pressure. We see that the higher the outer confining pressure is, the higher the strength of the column will be, which also makes sense intuitively.
(3) Part 3: Here we see the role of cohesion of the in-fill granular media. In the simplest terms, cohesion is the tendency of the particles of a media to stick together. So, the more they stick together, the lower the lateral pressure that they apply on the confining annulus and hence, the higher the yield strength of the composite column will be. Thus, yield strength is directly proportional to the cohesion of the in-fill granular media.
(4) Part 4: This term is due to the mass of the in-fill granular media. The greater the mass, the higher the applied lateral pressure on the confining annulus, and hence, the lower the strength of the composite column will be. Thus, this term has a negative sign as the higher its value, the weaker the composite column will be.
5.3 Understanding the Soil Characterisation Constant, S◦
As per Equation (19), we have three separate values of Soil Characterisation Constant, S◦ according to the three theories that we have used for the calculating the yield strength of composite column.
In order to understand the variation of S◦, we take some arbitrary values of φ and νgm. As per the literature, we assume that the φ value can lie between 25◦ to 45◦  and the νgm value can lie between 0.25 to 0.35 . Using this range, we plot the S◦ values for the three different theories.
From Figure 6, we observe that the S◦ would lie between 0.12 to 0.6 in all practical scenarios. So now we plot the variation of Yield Strength of Composite Column, fycc against the Soil Characterisation Constant, S◦. From Figure 7, we can say that the yield strength of composite column, fycc is somewhat inversely proportional to the Soil Characterisation Constant, S◦ of the in-fill granular media. Therefore, in order to design composite columns with high yield strength, we require granular in-fill with high angle of friction, φ, and low Poisson’s Ratio, νgm (from Figure 7).
6. Comparing Yield Strength Equation Results from the Different Theories Used
Figure 8 shows the relation between fycc and fycm for the different theories that we have used to calculate the yield strength of the composite column. The plot clearly indicates that as we start using stronger materials for the confining material for the composite column, it becomes more and more critical which theory we choose to calculate the yield strength of composite column. The deviation between the yield strength given by different theories increases as the strength of the confining material increases.
One thing to note is that although the absolute difference in the yield strength may be increasing, the percentage increase remains almost the same, i.e. about 16% from Jaky’s Theory to Theory of Solid Mechanics and approximately 29% from Theory of Solid Mechanics to Rankine’s Theory. Figure 9 shows the ratio of the yield strength
of the composite column calculated using the different theories under consideration. From the graph we can clearly observe that the ratio is almost a straight line under practical scenarios. Extrapolating the straight line on the y-axis, we see that the ratio between the yield strength using Rankine’s Theory and Theory of Solid Mechanics is approximately 1.28. The same is approximately 1.50 for Rankine’s Theory and Jaky’s Theory and 1.16 for Theory of Solid Mechanics and Jaky’s Theory.
7. Deriving π-constants for the Yield Strength Equation using Buckingham π Theorem
The Buckingham π Theorem provides a method for computing sets of dimensionless parameters from the given variables, even if the form of the equation is unknown. These “variables” are known as repetitive variables and should be chosen in such a manner that they do not form a non-dimensional pair among themselves. Using this theory, we derive the π constants for the combined yield strength equation derived in Section 5.1.
Since S◦ is a dimensionless variable in all the three cases, we remove that from the equation for the Buckingham π analysis.
Now there are eight variables in the above equation: fycc, fycm, b, a, pb, M, g, c . Let us assume 3 variables as the repetitive variable, namely fycm, a, M. Using the Buckingham π Theorem, we get the π constants as:
If we take a look at π3, we know that in our study, pb = Atmospheric Pressure = 0.101 MPa. And the value of fycm is of the order of tens and maybe hundreds of MPa. Therefore the value of π3 is approximately zero, and hence can be ignored for the purpose of establishing structural similitude. Similarly for π5, we know that the cohesion for granular materials is around 0.02 MPa. Hence the value of π5 is also approximately zero and can also be ignored. Therefore, the primary π constants that will determine the structural similitude between the model and the prototype are:
The current study focuses on deriving an equation to determine the yield strength of composite columns, fycc with a granular in-fill inside a confining annulus, under a uni- axial compressive load. In order to understand the range of loads for which we need to design the columns, we prepared analytical models of common residential structures which indicate that if our columns can sustain an axial load of 20 MPa or more, they can be a viable replacement for RCC columns.
We have used three separate theories to calculate the conversion ratio of the applied vertical stress on the granular material to the lateral stress on the confining material. Towards the end, we defined a new dimensionless number, Soil Characterisation Constant, S◦ to identify the granular material that we have in-filled inside the annulus and combine the yield strength equations of different theories into one single equation. We have also derived the π constants for the yield strength equation using the Buckingham π theorem.
The theoretical calculations of this study suggest that it is possible to manufacture structural columns for low to moderately loaded buildings using the proposed design. If done successfully, these columns will not only be cheap, but also sustainable and environmentally friendly as they can be made entirely from recycled or natural materials.
9. Future Directions
In the course of this study, we have taken certain assumptions that may not be true on the field. Due to the theoretical nature of these assumptions, the natural next step would be to quantify the final Yield Strength of Composition Column empirically.
Using the π constants that we have derived in Section 5.4, an experimental study can be conducted between the model and the structural prototype of the composite columns to further validate their use in the field.
Once these columns have been validated for use in the field, it can completely transform the way we build low rise affordable housing structures. Given the focus on reducing the transportation needs associated with civil construction, these columns can also be used for the construction of extraterrestrial structures.
I would like to thank Prof. Jose E. Andrade for providing me this opportunity to work under his guidance as a Summer Undergraduate Research Fellowship at California Institute of Technology. This article and the research behind it would not have been possible without his constant guidance and support. I would also like to thank Siavash Monfared who constantly helped me in overcoming any challenges that I faced during the course of my research and showed me the path wherever I got stuck. Lastly, I would like to thank all the fellow graduate and undergraduate students who I met during this fellowship. Their advice and direction was also a contributing part of my research work.
Mukul Gupta, Suresh Bhalla, Diwakar Bhagat, Roger P. West, and Aarti Nagpal. “Pre-Engineered Construction Using Bamboo Portal Frame”. In: 9th biennial conference on structural engineering convention (SEC 2014) (2014).
Hector Archila, Sebastian Kaminski, David Trujillo, Edwin Zea Escamilla, and Kent A. Harrires. “Bamboo Reinforced Conrete: A Critical Review”. In: Material and Structures 51.102 (2018).
P. Lalzarliana, Paihte Andy, C. Lalngaihawma, and GauravSaini. “Recycled Aggregate filled waste plastic bottles as a replacement of brick”. In: Materials Today 15.3 (2019), pp. 663–668. doi: https://doi.org/10.1016/j.matpr. 2019.04.135.
Jaky J. “Pressure in silos”. In: Proc., 2nd Int. Conf. on Soil Mech. and Found.Engrg. 1 (1948), pp. 103–107.
Rankine W. “On the stability of loose earth”. In: Philosophical Transactions of the Royal Society of London 147 (1856), pp. 103–107.
Bell A. L. “The lateral pressure and resistance of clay and the supporting power of clay foundations”. In: Minutes Proc. Inst. Civ. Eng. 199 (1915), pp. 233–272.
E. Buckingham. “On Physically Similar Systems; Illustrations of the Use of Dimensional Equations”. In: Phys. Rev. 345. doi: https://doi.org/10. 1103/PhysRev.4.345.
Vijay Krishnan Subramanian. “Quasi-Static Compression of Granular Materi- als(Sand) at High Pressure”. In: Master of Science in Mechanical Engineering Thesis.
M. M. Hagerty, D. R. Hite, C. R. Ulrich, and D. J. Hagerty. “One Dimensional High Pressure Compression of Granular Media”. In: ().
Kumar R., Bhargava K., and Choudhury D. “Estimation of Engineering Properties of Soils from Field SPT Using Random Number Generation”. In: INAE Letters 1 (2016), pp. 77–84. doi: https://doi.org/10.1007/s41403-016- 0012-6.