AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

Muter: Machine unlearning on adversarially trained models

J Liu, M Xue, J Lou, X Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Machine unlearning is an emerging task of removing the influence of selected
training datapoints from a trained model upon data deletion requests, which echoes the …

SalUn: Empowering machine unlearning via gradient-based weight saliency in both image classification and generation

C Fan, J Liu, Y Zhang, D Wei, E Wong, S Liu - arXiv preprint arXiv …, 2023 - arxiv.org
With evolving data regulations, machine unlearning (MU) has become an important tool for
fostering trust and safety in today's AI models. However, existing MU methods focusing on …

Citation: A key to building responsible and accountable large language models

J Huang, KCC Chang - arXiv preprint arXiv:2307.02185, 2023 - arxiv.org
Large Language Models (LLMs) bring transformative benefits alongside unique challenges,
including intellectual property (IP) and ethical concerns. This position paper explores a …

DataInf: Efficiently estimating data influence in LoRA-tuned LLMs and diffusion models

Y Kwon, E Wu, K Wu, J Zou - arXiv preprint arXiv:2310.00902, 2023 - arxiv.org
Quantifying the impact of training data points is crucial for understanding the outputs of
machine learning models and for improving the transparency of the AI pipeline. The …

Intriguing properties of data attribution on diffusion models

X Zheng, T Pang, C Du, J Jiang, M Lin - arXiv preprint arXiv:2311.00500, 2023 - arxiv.org
Data attribution seeks to trace model outputs back to training data. With the recent
development of diffusion models, data attribution has become a desired module to properly …

Contextual Confidence and Generative AI

S Jain, Z Hitzig, P Mishkin - arXiv preprint arXiv:2311.01193, 2023 - arxiv.org
Generative AI models perturb the foundations of effective human communication. They
present new challenges to contextual confidence, disrupting participants' ability to identify …

Adaptive instrument design for indirect experiments

Y Chandak, S Shankar, V Syrgkanis… - The Twelfth International …, 2023 - openreview.net
Indirect experiments provide a valuable framework for estimating treatment effects in
situations where conducting randomized control trials (RCTs) is impractical or unethical …

Merging by matching models in task subspaces

D Tam, M Bansal, C Raffel - arXiv preprint arXiv:2312.04339, 2023 - arxiv.org
Model merging aims to cheaply combine individual task-specific models into a single
multitask model. In this work, we view past merging methods as leveraging different notions …

Structured inverse-free natural gradient: Memory-efficient & numerically-stable KFAC for large neural nets

W Lin, F Dangel, R Eschenhagen, K Neklyudov… - arXiv preprint arXiv …, 2023 - arxiv.org
Second-order methods for deep learning, such as KFAC, can be useful for neural net
training. However, they are often memory-inefficient and numerically unstable for low …