Search: alignmentforum.org | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

31.

How truthful is GPT-3?

alignmentforum.org

5 years ago

2 points

32.

The Codex Skeptic FAQ

alignmentforum.org

5 years ago

2 points

33.

Opinions on Interpretable Machine Learning and 70 Summaries of Recent Papers

alignmentforum.org

5 years ago

2 points

34.

Interpreting GPT: The Logit Lens

alignmentforum.org

6 years ago

RiversHaveWings

2 points

35.

On the purposes of decision theory research

alignmentforum.org

7 years ago

2 points

36.

Reframing Superintelligence: Comprehensive AI Services as General Intelligence

alignmentforum.org

7 years ago

2 points

37.

OpenAI: Common Misconceptions

alignmentforum.org

4 years ago

1 point

38.

A Mechanistic Interpretability Analysis of Grokking

alignmentforum.org

4 years ago

1 point

39.

Embedding Spaces – Transformer Token Vectors Are Not Points in Space

alignmentforum.org

9 months ago

1 point

40.

LLMs Are Simulators

alignmentforum.org

a year ago

1 point

41.

AGI Safety and Alignment at Google DeepMind: A Summary of Recent Work

alignmentforum.org

2 years ago

1 point

42.

Mysteries of mode collapse (2022)

alignmentforum.org

2 years ago

1 point

43.

LLMs for Alignment Research: a safety priority?

alignmentforum.org

2 years ago

1 point

44.

Mesa-Optimization

alignmentforum.org

3 years ago

1 point

45.

alignmentforum.org

3 years ago

peter_d_sherman

1 point

46.

Concrete Open Problems in Mechanistic Interpretability

alignmentforum.org

3 years ago

1 point

47.

Glitch Tokens in GPT (SolidGoldMagikarp)

alignmentforum.org

3 years ago

1 point

48.

Central AI alignment problem: capabilities generalization and sharp left turn

alignmentforum.org

4 years ago

1 point

49.

A Mechanistic Interpretability Analysis of Grokking

alignmentforum.org

4 years ago

1 point

50.

A Mechanistic Interpretability Analysis of Grokking

alignmentforum.org

4 years ago

1 point

51.

A Mechanistic Interpretability Analysis of Grokking

alignmentforum.org

4 years ago

1 point

52.

Specificity: Brain's Superpower (2019)

alignmentforum.org

4 years ago

1 point

53.

Fixing the Good Regulator Theorem (2019)

alignmentforum.org

4 years ago

1 point

54.

Embedded Agency and AI Alignment

alignmentforum.org

4 years ago

1 point

55.

Demons In Imperfect Search (2020)

alignmentforum.org

5 years ago

1 point

56.

Another (Outer) AI Alignment Failure Story

alignmentforum.org

5 years ago

1 point

57.

Birds, Brains, Planes, and AI

alignmentforum.org

5 years ago

1 point

58.

Birds, Planes, Brains, and AI

alignmentforum.org

5 years ago

1 point

59.

Multi-dimensional rewards for AGI interpretability and control

alignmentforum.org

5 years ago

CapitalistCartr

1 point

60.

Alignment as a Bottleneck to Usefulness of GPT-3

alignmentforum.org

6 years ago

1 point