Large-Scale Analysis of Galaxy Spectra

  • Problem: Process and analyze a dataset of over 100,000 galaxy spectra to extract physical parameters for each galaxy.
  • Skills & Tools: Bayesian Inference, MCMC, Statistical Modeling, Python (NumPy, Pandas), Data Pipelines, Matplotlib, Seaborn.
  • Process:
    • Built end-to-end Python pipelines for efficient data cleaning, feature extraction, and visualization of the spectral data.
    • Applied Bayesian inference with Markov Chain Monte Carlo (MCMC) sampling, using both parametric and non-parametric models, to analyze the processed data.
  • Outcome: Processed and modeled the full dataset, enabling the extraction of physical parameters for over 100,000 galaxies.
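
Illustrative sketch of the MCMC step in the process above. This is not the project's actual code: it fits a single Gaussian emission line on a flat continuum to one synthetic spectrum with a hand-written random-walk Metropolis sampler in NumPy. The line model, the flat priors, the noise level, the step sizes, and every function name (gaussian_line, log_posterior, metropolis) are assumptions made for illustration only.

```python
import numpy as np


def gaussian_line(wave, amp, center, sigma, continuum):
    """Flat continuum plus a single Gaussian emission line (assumed model)."""
    return continuum + amp * np.exp(-0.5 * ((wave - center) / sigma) ** 2)


def log_posterior(theta, wave, flux, noise):
    """Gaussian log-likelihood with flat priors (assumed, for illustration)."""
    amp, center, sigma, continuum = theta
    # Flat priors: positive amplitude and width, line centre inside the window.
    if amp <= 0 or sigma <= 0 or not (wave[0] < center < wave[-1]):
        return -np.inf
    resid = flux - gaussian_line(wave, amp, center, sigma, continuum)
    return -0.5 * np.sum((resid / noise) ** 2)


def metropolis(log_prob, start, step, n_steps, rng, args=()):
    """Random-walk Metropolis sampler with a diagonal Gaussian proposal."""
    current = np.asarray(start, dtype=float)
    current_lp = log_prob(current, *args)
    chain = np.empty((n_steps, current.size))
    accepted = 0
    for i in range(n_steps):
        proposal = current + step * rng.normal(size=current.size)
        proposal_lp = log_prob(proposal, *args)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain[i] = current
    return chain, accepted / n_steps


# Synthetic spectrum standing in for one real galaxy spectrum.
rng = np.random.default_rng(42)
wave = np.linspace(6500.0, 6630.0, 200)      # wavelength grid (Angstrom)
truth = np.array([5.0, 6563.0, 4.0, 1.0])    # amp, center, sigma, continuum
noise = 0.2                                  # per-pixel noise, assumed known
flux = gaussian_line(wave, *truth) + rng.normal(scale=noise, size=wave.size)

start = np.array([4.0, 6562.0, 5.0, 0.9])    # rough initial guess
step = np.array([0.05, 0.04, 0.04, 0.01])    # hand-tuned proposal widths
chain, acc_rate = metropolis(
    log_posterior, start, step, 20000, rng, args=(wave, flux, noise)
)

samples = chain[5000:]                       # discard burn-in
posterior_mean = samples.mean(axis=0)
posterior_std = samples.std(axis=0)
for name, m, s in zip(["amp", "center", "sigma", "continuum"],
                      posterior_mean, posterior_std):
    print(f"{name:>9s} = {m:.3f} +/- {s:.3f}")
```

At the scale of 100,000 spectra, a fit like this would typically run once per spectrum inside the pipeline, with an established sampler library replacing the hand-written loop; that choice is likewise an assumption and not a description of the original project.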