Editor's Note: This blog article is an AI-supported distillation of an in-person event held in 🇨🇦 Vancouver 🇨🇦 on 2024-11-28. It does not reflect the views of the facilitators, writer, or the AI Salon - it is meant to capture the conversations at the event. Quotes are paraphrased from the original conversation.

A provocative assertion opened a recent discussion among scientists, engineers, and AI researchers: if we can solve artificial intelligence, everything else might follow. This claim rests on an intriguing observation: machine learning occupies a unique position among scientific fields. While fields like biology require complex physical infrastructure and face numerous practical constraints, machine learning research can be contained entirely within computational systems, making it an ideal candidate for automation.
Yet this apparent advantage raises deeper questions about the nature of scientific discovery and whether our current approaches to AI can capture the essence of scientific thinking. The discussion revealed fundamental tensions between systematic exploration and creative insight, between brute-force computation and intuitive understanding.
👉 Jump to a longer list of takeaways and open questions
Main Takeaways
AI systems complement human capabilities while revealing fundamental limitations: they excel at processing vast amounts of information but struggle with intuition and conceptual understanding
The current "brute force" approach to hypothesis generation may require more sophisticated methods to capture breakthrough insights
Traditional academic publishing needs fundamental reimagining for an AI-augmented scientific future
Scientific AI systems must integrate pattern-matching capabilities with formal logical reasoning, mirroring human System 1/System 2 cognition
The Nature of Scientific Discovery
AI-driven scientific discovery relies heavily on generating vast numbers of hypotheses and filtering for promising candidates. In mathematics, this approach has yielded impressive results - a 2024 DeepMind paper demonstrated how sampling millions of mathematical heuristics could produce superhuman insights.
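As a rough illustration of this generate-and-filter pattern (a minimal sketch, not the paper's actual method), the Python below samples many candidate heuristics and keeps only the best scorers; the toy objective and all names are invented for the example.

```python
import random

# Toy stand-in for a research problem: recover the integer coefficients (a, b, c)
# of a hidden quadratic by proposing many random candidates and filtering.

def evaluate(candidate, test_points):
    """Score a candidate heuristic; higher is better (negative squared error)."""
    a, b, c = candidate
    target = lambda x: 3 * x * x - 2 * x + 7  # the "truth" the search must rediscover
    return -sum((a * x * x + b * x + c - target(x)) ** 2 for x in test_points)

def generate_candidates(n):
    """Stand-in for an LLM or sampler cheaply proposing many hypotheses."""
    return [tuple(random.randint(-10, 10) for _ in range(3)) for _ in range(n)]

def brute_force_search(n_candidates=50_000, keep=5):
    """Generate a large batch of hypotheses, then filter for the most promising."""
    test_points = range(-5, 6)
    scored = [(evaluate(c, test_points), c) for c in generate_candidates(n_candidates)]
    scored.sort(reverse=True)  # best (least-wrong) candidates first
    return scored[:keep]

if __name__ == "__main__":
    random.seed(0)
    for score, candidate in brute_force_search():
        print(score, candidate)
```

In the real setting the sampler would be a language model proposing programs or hypotheses and the evaluator an experiment, proof checker, or benchmark, but the loop of cheap generation followed by aggressive filtering is the same.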
However, this brute-force approach raises fundamental questions about the nature of scientific progress. As one researcher noted, "Even if a human scientist spent two years completely failing at three projects, getting zero papers, they would get a lot out of it. And our system currently doesn't imitate that." This observation points to a crucial distinction between human and artificial approaches to scientific discovery - humans learn from failure in ways that current AI systems cannot.
The conversation revealed a particularly nuanced perspective on scientific productivity. While current systems might be capable of automating what one participant called "mediocre" science - the kind of incremental work that fills many academic journals - this might not significantly advance human knowledge. As was provocatively stated,
"Even if we automated the bottom 99% of science, I don't think it actually matters. It's the one or two papers per decade that really moves the needle that we need to capture."
This tension between quantity and quality in scientific output connects to a deeper question about scientific intuition or "taste." Multiple participants noted how difficult it is to formalize the kind of intuition that allows experienced scientists to identify promising research directions. One researcher observed that "having that PI with good taste... it comes down to taste and it's possibly one of the hardest things in building academia."
Current AI Systems: Capabilities and Limitations
Automated drug discovery provided a concrete example of how AI operates in scientific disciplines today. With roughly 50% success rates in predicting molecular binding properties, these systems demonstrate significant capability in well-defined problem spaces, yet the process still benefits substantially from human oversight. As one participant noted,
“We use the sort of data-driven machine learning approach... we'll then back it up with computational chemistry as well as sticking the eye of a trained medicinal chemist on it. Obviously you want to throw everything at the problem because it is so expensive to synthesize and test a novel compound”
This multilayered approach - combining machine learning predictions, computational verification, and human expertise - demonstrates how AI augments rather than replaces scientific judgment. Success stems from clear evaluation metrics and the ability to combine multiple analytical approaches.
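To make that layered triage concrete, here is a minimal sketch assuming a three-stage filter (fast learned score, slower physics-based check, then a flag for human review); the molecules, scoring functions, and cutoffs are placeholders rather than anyone's actual tooling.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Candidate:
    smiles: str                 # molecular structure as a SMILES string
    ml_score: float = 0.0       # fast, learned binding-affinity estimate
    physics_score: float = 0.0  # slower computational-chemistry check
    notes: List[str] = field(default_factory=list)

def ml_binding_model(smiles: str) -> float:
    """Placeholder for a learned binding-affinity predictor (higher = better)."""
    return (sum(ord(ch) for ch in smiles) % 100) / 100.0

def docking_simulation(smiles: str) -> float:
    """Placeholder for a slower, physics-based verification step."""
    return (sum(i * ord(ch) for i, ch in enumerate(smiles, 1)) % 100) / 100.0

def triage(library: List[str], ml_cutoff: float = 0.7, physics_cutoff: float = 0.6) -> List[Candidate]:
    """Run the cheap ML filter first, the physics check second, and flag
    whatever survives for review by a medicinal chemist."""
    survivors = []
    for smiles in library:
        cand = Candidate(smiles, ml_score=ml_binding_model(smiles))
        if cand.ml_score < ml_cutoff:
            continue                     # rejected by the fast model
        cand.physics_score = docking_simulation(smiles)
        if cand.physics_score < physics_cutoff:
            continue                     # rejected by the slower check
        cand.notes.append("flag for medicinal-chemist review")
        survivors.append(cand)
    return survivors

if __name__ == "__main__":
    print(triage(["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O"]))
```

The design point is simply ordering: cheap, noisy filters run over everything, expensive checks run over the survivors, and human judgment is reserved for the short list that remains, because synthesizing and testing a compound is the costliest step of all.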
However, the limitations of current systems became apparent when discussing more abstract reasoning tasks. Large Language Models (LLMs), while powerful, exhibit what one participant called "really egregious blind spots." The discussion of mathematical reasoning provided a concrete example: when presented with novel mathematical concepts, LLMs often fail to maintain logical consistency or generate valid counterexamples. This limitation points to a deeper issue - the difference between pattern matching and genuine conceptual understanding.
The debate around AI "agents" highlighted another crucial aspect of current systems. While some define agents broadly as systems with goals and the ability to act on their environment, others argue for more stringent criteria. As one participant noted, "You have to kind of make sure that the environment and the action space are kind of rich enough to be interesting to kind of call it an agent." This distinction becomes particularly relevant when considering scientific discovery, where the ability to interact with and modify experimental conditions is often crucial.
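For reference in that definitional debate, the sketch below is a deliberately minimal goal-directed loop: it technically has a goal, an environment, and actions, yet its action space is so impoverished that, per the quoted point, few would find it interesting to call it an agent. All names here are illustrative.

```python
class Environment:
    """A deliberately impoverished environment: one integer state, two actions."""
    def __init__(self):
        self.state = 0

    def step(self, action: int) -> int:
        """Apply the action (0 = wait, 1 = increment) and return the new observation."""
        self.state += action
        return self.state

class Agent:
    """Goal-directed in the loosest sense: nudges the state toward a fixed target."""
    def __init__(self, goal: int):
        self.goal = goal

    def act(self, observation: int) -> int:
        """Choose an action based on the current observation."""
        return 1 if observation < self.goal else 0

if __name__ == "__main__":
    env, agent = Environment(), Agent(goal=5)
    obs = env.state
    for _ in range(10):
        obs = env.step(agent.act(obs))
    print("final state:", obs)  # reaches its goal, yet the action space is trivially small
```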
Reimagining Scientific Knowledge Organization
Perhaps the most provocative insights emerged around the future of scientific knowledge organization, whose limitations extend beyond mere inefficiency. Academic papers, as one participant argued, represent "a very fragmented, noisy, kind of crappy, diffuse way of knowing something." This fundamental critique suggests the need for entirely new approaches to organizing and advancing scientific knowledge:
Moving beyond individual papers to more integrated knowledge representations, potentially using AI to synthesize and connect findings across fields.
Developing new evaluation metrics that prioritize genuine breakthrough insights over incremental progress. As one researcher noted, "If you're optimizing for citations, you are going to aim to make the most impactful paper."
Rethinking the role of negative results. Several participants noted how current publication biases exclude valuable information about what doesn't work - information that could be crucial for AI systems learning to do science.
Integrating formal reasoning systems with natural language processing. One participant suggested that effective scientific AI might need to combine "System 1" pattern matching with "System 2" logical reasoning: "LLMs need to be married to some kind of system two to become reliable." (A minimal sketch of this pairing follows below.)
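As a rough illustration of that System 1/System 2 pairing (a toy sketch, not anyone's proposed architecture), the code below uses a cheap "proposer" standing in for an LLM that guesses prime-generating polynomials, and an exhaustive "verifier" that only lets conjectures through if they actually hold on the stated domain; both functions are invented for the example.

```python
import random

def is_prime(n: int) -> bool:
    """Slow but exact primality check used by the verifier."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def system1_propose(n_guesses: int):
    """Stand-in for an LLM ("System 1"): cheaply guesses conjectures of the form
    'k*x*x + k*x + c is prime for every x in 0..39', with random small k and c."""
    return [(random.randint(1, 3), random.randint(1, 60)) for _ in range(n_guesses)]

def system2_verify(conjecture) -> bool:
    """Formal, exhaustive check ("System 2"): accept a conjecture only if it
    actually holds on the stated domain."""
    k, c = conjecture
    return all(is_prime(k * x * x + k * x + c) for x in range(40))

if __name__ == "__main__":
    random.seed(0)
    proposals = system1_propose(5000)
    verified = sorted({p for p in proposals if system2_verify(p)})
    print(verified)  # Euler's polynomial x*x + x + 41, i.e. (1, 41), should survive
```

The division of labor mirrors the quoted suggestion: the fast component supplies volume and variety, while the slow component supplies the reliability the fast one lacks.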
Practical Implications
This discussion points to a clear pattern: AI excels at tasks involving comprehensive data analysis, pattern recognition across vast datasets, and systematic exploration of well-defined possibility spaces. The immediate opportunity lies not in replacing scientists but in eliminating routine cognitive overhead. As one researcher noted, a single investment analyst working with AI can accomplish more than a team of five junior analysts, while also pursuing analyses that would be impossible for human teams.
However, this efficiency gain reveals a critical challenge for scientific institutions. The traditional apprenticeship model of science—where junior researchers develop expertise through hands-on experience with routine tasks—may need restructuring. Some participants suggested that rather than automating entry-level research tasks, AI systems could be designed to enhance the learning process itself, providing richer feedback and enabling more ambitious projects earlier in scientific careers.
The path forward appears to be emerging: develop AI systems that can handle the systematic aspects of scientific work—literature review, data analysis, hypothesis generation—while preserving human oversight of research direction and interpretation of results. This approach acknowledges both the power of AI in processing and pattern recognition and the crucial role of human judgment in identifying truly significant scientific advances.
However, a more distant goal, and a key motivation behind the pursuit of artificial general intelligence (AGI), is the belief that scalable, human-like reasoning could help tackle pressing scientific and technological challenges: curing diseases, discovering new materials, developing low-carbon energy systems, and making fundamental breakthroughs in physics and chemistry. On this view, the real bottleneck for progress is a specialized kind of intelligence that AGI aspires to replicate. Although today's AI systems largely serve as extensions of human intellect, the question remains whether AGI will eventually match human scientists and even automate the scientific discovery process.
Notes from the conversation
Machine learning is currently more approachable for AI automation compared to other scientific fields due to its computational nature and containment within computer systems
Physical sciences face additional challenges for automation due to the need for real-world experimentation and expensive equipment
The distinction between "on-rails" and open-ended automation is crucial - structured tasks are easier to automate but may limit novel discoveries
Current AI systems can achieve ~50% success rate in drug discovery when predicting molecular binding properties
There's ongoing debate about the definition of AI "agents" - ranging from simple goal-directed systems to more complex autonomous entities
The concept of agency exists on a spectrum rather than being binary
Current AI systems excel at complementing human capabilities (memory, processing speed) rather than replacing human intuition
The "brute force" approach of generating many hypotheses and filtering for promising ones shows some success but may be inefficient
Scientific intuition and "taste" are difficult to encode in AI systems but crucial for meaningful discoveries
The academic paper format may not be the optimal way to organize and advance scientific knowledge
Current LLMs struggle with consistent logical reasoning and conceptual understanding
The distinction between System 1 (fast, intuitive) and System 2 (slow, analytical) thinking is relevant for AI architecture
There may be fundamental limitations to knowledge that can be encoded purely in language
The integration of formal reasoning systems with LLMs could address current limitations
Scientific progress often comes from serendipitous discoveries that are hard to plan for
The current academic incentive structure may not optimize for meaningful scientific advancement
AI could help democratize access to scientific knowledge and research capabilities
There's tension between automation and maintaining the pipeline for training future scientists
The goal of AI in science should perhaps focus on augmenting rather than replacing human scientists
The future of AI in science may require new architectures beyond current LLM approaches
Questions
How do we balance the efficiency of automated research with maintaining the human training pipeline?
Can scientific intuition be fully encoded in AI systems, or are there fundamental limitations?
Should we optimize for generating more papers, or focus on fundamental breakthroughs?
How do we evaluate the "interestingness" or novelty of AI-generated hypotheses?
Can language models truly understand concepts without physical embodiment?
How do we maintain scientific rigor while increasing research automation?
What is the appropriate balance between exploration and targeted research?
How do we incorporate negative results into AI training data?
Can AI systems develop genuine scientific curiosity?
Should we focus on augmenting human scientists or creating autonomous AI scientists?
How do we preserve serendipitous discovery in automated systems?
What is the role of peer review in an AI-augmented scientific process?
How do we ensure AI systems can recognize truly novel insights?
Can AI develop scientific taste without human guidance?
How do we balance computational efficiency with thorough exploration?
What is the appropriate metric for evaluating AI-generated research?
How do we integrate embodied knowledge with language-based systems?
Can AI systems develop genuine understanding versus pattern matching?
How do we maintain diversity of thought in automated research?
What is the optimal architecture for scientific AI systems?