eprintid: 36417
rev_number: 15
eprint_status: archive
userid: 8572
dir: disk0/00/03/64/17
datestamp: 2025-04-17 09:37:52
lastmod: 2025-04-23 12:14:31
status_changed: 2025-04-17 09:37:52
type: conferenceObject
metadata_visibility: show
creators_name: Partanen, Julian
creators_name: Everling, Markus
creators_name: Kempf, Dominic
creators_name: Knoth, Tim
creators_name: Böhm, Don
creators_name: Kepper, Nikolaus
creators_name: Baumann, Martin
creators_name: Haller, Alexander
title: Introducing Project-W: A self-hostable platform for OpenAI’s Whisper
divisions: i-704000
pres_type: poster
abstract: Speech-to-text technologies, driven by advancements in artificial intelligence, are increasingly beneficial to sectors like research and education. These systems enable the transcription of large volumes of audio data, making it easier to process, analyze, and archive information. However, concerns over data privacy and reliance on cloud-based services have created a need for self-hosted solutions. Project-W addresses these issues by providing a private, AI-driven transcription platform based on OpenAI's Whisper general-purpose speech recognition model.

The main goal of Project-W is to offer an easy-to-use, open-source, scalable transcription solution that ensures data privacy by running entirely on local infrastructure. Specifically, it is designed for environments like universities and research institutions that handle sensitive information. By eliminating the need for cloud services, Project-W safeguards data while leveraging powerful AI models for accurate transcription. It aims to simplify transcription workflows, enabling users to manage their audio processing needs efficiently and securely.

Project-W is built with a Flask-based backend, a Svelte-powered frontend, and Python runners. The backend handles transcription tasks, while the frontend provides an intuitive interface for users to submit, track, and retrieve jobs. Python runners manage the interaction with OpenAI's Whisper AI model, and all components communicate via an HTTP REST API. The platform supports deployment on high-performance hardware, optimizing the processing of large and complex models. Key features include local data storage, user-friendly job management, and scalable infrastructure to handle varying workloads, making it adaptable to diverse environments.

Preliminary testing of Project-W in a university setting demonstrates that the platform is capable of handling significant transcription workloads while maintaining high levels of data security. Its modular architecture allows for customization based on user requirements, such as integrating with institutional servers or enhancing hardware capabilities to improve transcription speed. The platform’s user-friendly web interface streamlines job management, ensuring that even non-technical users can use the tool effectively.

Ongoing work focuses on optimizing the platform's performance for large-scale use while actively gathering feedback from both users and administrators to improve functionality and user experience. Further evaluations will be conducted to assess its viability as a central transcription service across other departments, with a view toward broad institutional adoption.
date: 2025
id_scheme: DOI
id_number: 10.11588/heidok.00036417
collection: c-62
ppn_swb: 1923493620
own_urn: urn:nbn:de:bsz:16-heidok-364178
language: eng
bibsort: PARTANENJUINTRODUCIN20250314
full_text_status: public
place_of_pub: Heidelberg
event_title: E-Science-Tage 2025
event_location: Universität Heidelberg
event_dates: 12.03.2025 - 14.03.2025
citation: Partanen, Julian; Everling, Markus; Kempf, Dominic; Knoth, Tim; Böhm, Don; Kepper, Nikolaus; Baumann, Martin; Haller, Alexander (2025) Introducing Project-W: A self-hostable platform for OpenAI’s Whisper. [Conference Item]
document_url: https://archiv.ub.uni-heidelberg.de/volltextserver/36417/7/Partanen_introducing_project-w_2025.pdf