AI as a learning resource

Understanding predictions and verifying responses

John Little

Duke University Libraries

Center for Data & Visualization Sciences

2024-12-03

AI and new challenges

  • Introduction of Generative LLMs (e.g., ChatGPT)
    • Translation
    • Synthesis
  • New challenges:
    1. How to ask questions of a generative AI
    2. How to frame questions to reflect goals

The Confidence v Competence Paradox


  • LLMs give confident responses
  • Responses are predictions, not necessarily correct answers
  • Incorrect predictions = “hallucinations”
  • Verification is crucial
  • Paradox: More knowledge leads to better evaluation of AI responses

Use case - Code generation

  • Data transformation
  • Data analysis
  • Iteration
  • Big Data
  • AI assistance / AI-paired coding

Goal

Create scatter plots, one for each home world

Case Study - Star Wars Dataset

Homeworld | Heights       | Masses      | Characters
Tatooine  | 172, 188, 178 | 77, 84, 120 | Luke Skywalker, Anakin Skywalker, Owen Lars
Alderaan  | 150, 191      | 49, 85      | Leia Organa, Bail Prestor Organa
Naboo     | 165, 196, 170 | 45, 66, 75  | Padmé Amidala, Isadore, Palpatine
Coruscant | 66, 188       | 17, 84      | Yoda, Mace Windu
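
The goal of one scatter plot per homeworld first needs the data reshaped, because each row of the table packs several characters into one cell. The following is a minimal sketch of that reshaping and iteration, not the presenter's own code. The function names (to_long, group_by_homeworld, plot_by_homeworld) and the output file name are hypothetical, and the plotting step assumes matplotlib is installed.

```python
from collections import defaultdict

# Wide rows copied from the table above: one row per homeworld,
# with parallel lists of heights, masses, and character names.
WIDE_ROWS = [
    ("Tatooine", [172, 188, 178], [77, 84, 120],
     ["Luke Skywalker", "Anakin Skywalker", "Owen Lars"]),
    ("Alderaan", [150, 191], [49, 85],
     ["Leia Organa", "Bail Prestor Organa"]),
    ("Naboo", [165, 196, 170], [45, 66, 75],
     ["Padmé Amidala", "Isadore", "Palpatine"]),
    ("Coruscant", [66, 188], [17, 84],
     ["Yoda", "Mace Windu"]),
]


def to_long(wide_rows):
    """Reshape to long format: one record per character."""
    long_rows = []
    for homeworld, heights, masses, names in wide_rows:
        # Guard against cells whose lists do not line up.
        if not (len(heights) == len(masses) == len(names)):
            raise ValueError(f"ragged row for {homeworld}")
        for name, height, mass in zip(names, heights, masses):
            long_rows.append({"homeworld": homeworld, "name": name,
                              "height": height, "mass": mass})
    return long_rows


def group_by_homeworld(long_rows):
    """Collect the long records into one list per homeworld."""
    groups = defaultdict(list)
    for row in long_rows:
        groups[row["homeworld"]].append(row)
    return dict(groups)


def plot_by_homeworld(groups, out_path="homeworlds.png"):
    """Draw one scatter panel per homeworld (assumes matplotlib is installed)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, len(groups),
                             figsize=(4 * len(groups), 4), squeeze=False)
    for ax, (homeworld, rows) in zip(axes[0], groups.items()):
        ax.scatter([r["height"] for r in rows], [r["mass"] for r in rows])
        ax.set_title(homeworld)
        ax.set_xlabel("height")
        ax.set_ylabel("mass")
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```

Calling plot_by_homeworld(group_by_homeworld(to_long(WIDE_ROWS))) would write a single image with four panels. The reshaping step is the part worth checking by hand before trusting any generated plot.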

Example

Challenges in AI Assistance

  • AI handles some basic visualization and coding well
  • Struggles with complex data shaping and iteration
  • The problem is easier when the user has knowledge of:
    • Coding concepts
    • Data shaping
    • Visualization
    • Iteration for large datasets

When it goes wrong

Word problems

Prompt: “How long does it take to walk 10,000 steps on a treadmill at 1.2 MPH?” (AI responses were inconsistent)

  • Lesson 1: Importance of cross-verification
  • Lesson 2: Prediction is not the same as mathematical truth
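
One way to cross-verify the treadmill question is to do the arithmetic directly. The sketch below is illustrative only: the prompt gives no stride length, so the three stride values are assumptions, and walking_time_hours is a hypothetical helper name.

```python
FEET_PER_MILE = 5280


def walking_time_hours(steps, stride_feet, speed_mph):
    """Time to cover `steps` strides of `stride_feet` each at `speed_mph`."""
    distance_miles = steps * stride_feet / FEET_PER_MILE
    return distance_miles / speed_mph


# The stride lengths below are assumed values, not part of the prompt.
for stride_feet in (2.0, 2.5, 3.0):
    hours = walking_time_hours(10_000, stride_feet, 1.2)
    whole_hours, minutes = divmod(round(hours * 60), 60)
    print(f"stride {stride_feet} ft: {whole_hours} h {minutes} min")
```

With an assumed 2.5 ft stride the result is a little under four hours, and it moves by roughly 45 minutes for each half foot of stride. Any answer that states no stride assumption cannot be checked.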

EEBO (Early English Books Online)

No ground truth

Code

Translation done poorly

  • Due to insufficient background and/or prompting

AI-paired code generation

  • There are some clear winners and losers among the big names; each LLM has its own evolving strengths, weaknesses, and tendencies.


These problems highlight the Confidence v Competence Paradox but are easily verifiable

When it goes right

and how right does it go?

Synthetic questions

Prompt: Compare student body and faculty diversity at Duke University with UNCG. Compare today with 1985.

  • Lesson 1: Different LLMs give different amounts of evidence for verification
  • Lesson 2: Differing amounts of ground truth will affect the prediction

Code translation

I have Python code, give it to me in R

Variations in code translations

  • R to Python
  • Python to R
  • SQL from natural language
  • JavaScript
  • HTML
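
Translated or generated code is easiest to trust when it can be run against a small input with a known answer. The following is a rough sketch of that idea for the SQL-from-natural-language case, using Python's built-in sqlite3 module and four rows from the Star Wars table. The table name, the helper check_generated_sql, and the query text are hypothetical stand-ins for whatever an LLM returns.

```python
import sqlite3


def check_generated_sql(sql, expected_rows):
    """Run `sql` against a tiny in-memory table and compare with a hand-computed answer."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE characters "
            "(name TEXT, homeworld TEXT, height INTEGER, mass INTEGER)"
        )
        conn.executemany(
            "INSERT INTO characters VALUES (?, ?, ?, ?)",
            [
                ("Yoda", "Coruscant", 66, 17),
                ("Mace Windu", "Coruscant", 188, 84),
                ("Leia Organa", "Alderaan", 150, 49),
                ("Bail Prestor Organa", "Alderaan", 191, 85),
            ],
        )
        actual = conn.execute(sql).fetchall()
    finally:
        conn.close()
    return actual == expected_rows


# Hypothetical LLM output for: "average height per homeworld, in alphabetical order"
GENERATED_SQL = """
SELECT homeworld, AVG(height)
FROM characters
GROUP BY homeworld
ORDER BY homeworld
"""

# Worked out by hand: (150 + 191) / 2 and (66 + 188) / 2
EXPECTED = [("Alderaan", 170.5), ("Coruscant", 127.0)]

print(check_generated_sql(GENERATED_SQL, EXPECTED))
```

The same pattern applies to R-to-Python or Python-to-R translations: keep the test data small enough that the expected result can be computed by hand.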

Natural language

 

How can I use the phrase “Sticky Wicket” in German?

  • Translate Sticky Wicket to German
  • But how do you verify the result? (The same problem as with code)

Value in Reproducibility

  • Coding
    • Do everything with code
    • Including report generation
  • No Code
    • Getting better all the time


Increasingly, we are seeing computational environments with built-in AI pairing

Solutions

and best practices

Problems and Solutions

  • GIGO (Garbage In, Garbage Out) still applies
  • Prompt engineering is a crucial skill
  • AI excels in translation tasks
  • Good for synthetic questions whose answers can be validated
  • Less reliable for tasks without established ground truth

Best Practices

Using broad-based LLMs:

  • ChatGPT
  • Microsoft Copilot
  • Claude.ai
  • Gemini.google.com
  • GitHub Copilot (for AI-paired coding)

Prompt Engineering

  • Identify role
  • Identify audience
  • Identify voice
  • Identify goals and problem
  • Use multiple steps
  • Verify
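
The checklist above can be turned into a reusable prompt template. This is only a sketch of one possible layout: build_prompt and its field labels are hypothetical, and no particular LLM requires this format.

```python
def build_prompt(role, audience, voice, goal, steps):
    """Assemble a prompt that states role, audience, voice, goal, and ordered steps."""
    lines = [
        f"Role: {role}",
        f"Audience: {audience}",
        f"Voice: {voice}",
        f"Goal: {goal}",
        "Work through these steps in order:",
    ]
    # Number the steps so each one can be checked separately.
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    # Ask for stated assumptions to make verification easier.
    lines.append("State any assumptions so the result can be verified.")
    return "\n".join(lines)


prompt = build_prompt(
    role="You are an experienced data analyst",
    audience="a researcher new to coding",
    voice="plain and concise",
    goal="create scatter plots of height and mass, one for each homeworld",
    steps=[
        "Reshape the table to one row per character",
        "Group the rows by homeworld",
        "Draw one scatter plot per group",
    ],
)
print(prompt)
```

Breaking the goal into numbered steps also gives the user a natural place to verify each intermediate result.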

Conclusion

Embracing AI in data analysis

  • AI is a powerful tool, but requires careful use
  • The library offers crucial guidance
  • Continuous learning and adaptation are essential

Questions

  1. How do you see these tools or techniques impacting research and research investment?
  2. Do you have data transformation, reshaping, or analysis tasks that could benefit from AI assistance?
  3. In what ways do you think we can improve training and assistance for next generation LLMs?
  4. What are some of the biggest challenges you see in the future of AI-paired coding?