P2, S1: Second pod discussing the research paper - AI Alignment: A Comprehensive Survey (2024). Key areas explored include learning from feedback, addressing distribution shifts, assurance techniques like safety evaluations and interpretability, and governance practices by various stakeholders. The text emphasizes the socio-technical aspects of AI alignment, highlighting the evolving challenges as AI becomes more integrated into society.

2-Navigating_AI_Alignment__Risks,_Research,_and_the_Future_of_Human_Control

AI Alignment: what it is, why misalignment happens, and steps to prevent it

3-AI_Alignment__Preventing_Existential_Risk_from_Superintelligent_Machines

The ASIC (Aligned Sovereign Intelligence Core) framework: governance approaches and technical solutions are deeply intertwined and mutually reinforcing, forming a comprehensive strategy for AI alignment.

4-ASIC__Engineering_a_Governed_Future_for_AI_-_From_National_Sovereignty_to_Global_Trust

This podcast discusses hallucinations, the Guardrails AI open API, and its use cases.

5_Decoding_AI_Hallucinations__How_Guardrails_AI_Keeps_LLMs_Honest_and_Safe

LLMs are Bayesian, In Expectation, Not in Realization, by Leon Chlon, Hassana Labs (2025). This paper provides a framework for understanding the statistical behavior of transformers, demonstrating that their architectural "biases" can actually be features that enhance efficiency. The proposed solutions offer practical ways to leverage this understanding for better uncertainty quantification.

6-The_Order_Paradox_How_LLMs_Break_Bayesian_Rules

The ASI Institute Podcast - we cover all things surrounding aligned AI: robotics, security, model training, and user behavior too!


ASI Institute
