eprintid: 36417
rev_number: 15
eprint_status: archive
userid: 8572
dir: disk0/00/03/64/17
datestamp: 2025-04-17 09:37:52
lastmod: 2025-04-23 12:14:31
status_changed: 2025-04-17 09:37:52
type: conferenceObject
metadata_visibility: show
creators_name: Partanen, Julian
creators_name: Everling, Markus
creators_name: Kempf, Dominic
creators_name: Knoth, Tim
creators_name: Böhm, Don
creators_name: Kepper, Nikolaus
creators_name: Baumann, Martin
creators_name: Haller, Alexander
title: Introducing Project-W: A self-hostable platform for OpenAI’s Whisper
divisions: i-704000
pres_type: poster
abstract: Speech-to-text technologies, driven by advancements in artificial intelligence, are increasingly beneficial to sectors like research and education. These systems enable the transcription of vast audio data, making it easier to process, analyze, and archive information. However, concerns over data privacy and reliance on cloud-based services have prompted the need for self-hosted solutions. Project-W addresses these issues by providing a private, AI-driven transcription platform based on OpenAI's Whisper general-purpose speech recognition model. The main goal of Project-W is to offer an easy open-source, scalable transcription solution that ensures data privacy by running entirely on local infrastructure. Specifically, it is designed for environments like universities and research institutions that handle sensitive information. By eliminating the need for cloud services, Project-W safeguards data while leveraging powerful AI models for accurate transcription. It aims to simplify transcription workflows, enabling users to manage their audio processing needs efficiently and securely. Project-W is built with a Flask-based backend, a Svelte-powered frontend, and Python runners. The backend handles transcription tasks, while the frontend provides an intuitive interface for users to submit, track, and retrieve jobs. Python runners manage the interaction with OpenAI's Whisper AI model, and all components communicate via an HTTP REST API. The platform supports deployment on high-performance hardware, optimizing the processing of large and complex models. Key features include local data storage, user-friendly job management, and scalable infrastructure to handle varying workloads, making it adaptable to diverse environments. Preliminary testing of Project-W in a university setting demonstrates that the platform is capable of handling significant transcription workloads while maintaining high levels of data security. Its modular architecture allows for customization based on user requirements, such as integrating with institutional servers or enhancing hardware capabilities to improve transcription speed. The platform’s user-friendly web interface streamlines job management, ensuring that even non-technical users can effectively utilize the tool. Ongoing work focuses on optimizing the platform's performance for large-scale use while actively gathering feedback from both users and administrators to improve functionality and user experience. Further evaluations will be conducted to assess its viability as a central transcription service across other departments, with a view toward broad institutional adoption.
date: 2025
id_scheme: DOI
id_number: 10.11588/heidok.00036417
collection: c-62
ppn_swb: 1923493620
own_urn: urn:nbn:de:bsz:16-heidok-364178
language: eng
bibsort: PARTANENJUINTRODUCIN20250314
full_text_status: public
place_of_pub: Heidelberg
event_title: E-Science-Tage 2025
event_location: Universität Heidelberg
event_dates: 12.03.2025 - 14.03.2025
citation: Partanen, Julian ; Everling, Markus ; Kempf, Dominic ; Knoth, Tim ; Böhm, Don ; Kepper, Nikolaus ; Baumann, Martin ; Haller, Alexander (2025) Introducing Project-W: A self-hostable platform for OpenAI’s Whisper. [Conference Item]
document_url: https://archiv.ub.uni-heidelberg.de/volltextserver/36417/7/Partanen_introducing_project-w_2025.pdf