Python: Post Many Audio Files and Get the Results

I have thousands of audio files on my local computer, and I run a FastAPI service that extracts features from audio files and returns them to me as JSON.

The procedure is as follows: first I send the audio file and get a token back. I can then use the token to check repeatedly whether processing has completed. Once it has, I get the features back.

I managed to post one audio file with:

    import requests

    url = 'https://something/submit/'
    path = 'dir/interview-10001.wav'
    with open(path, 'rb') as fobj:
        res = requests.post(url, files={'file': fobj})
    print(res.text)
    # {"message":"Submitted successfully","token":"abcd"}

... and I managed to get the features for one file with:

    url = 'https://something/features/abcd'
    res_feat = requests.get(url)
    res_feat.json()
    # {'filesize': 253812, 'language': 'de', 'language_prob': 0.9755016565322876, 'n_words': 15, ... }

How can I now make hundreds or thousands of requests in parallel or asynchronously and collect the results?
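For reference, the submit-then-poll procedure described above can be run for many files with a plain thread pool around `requests`, with no asyncio involved. The following is only a sketch under assumptions: the helper names (`submit_file`, `fetch_features`, `process_one`, `process_many`) and the parameters `poll_interval`, `timeout` and `max_workers` are hypothetical, and it assumes the features endpoint answers with a non-200 status until the result is ready, which the post does not state.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

BASE_URL = "https://something"  # placeholder host taken from the question


def submit_file(path):
    """Upload one file and return the token from the JSON reply."""
    import requests  # third-party; imported lazily so the helpers below need only the stdlib
    with open(path, "rb") as fobj:
        res = requests.post(f"{BASE_URL}/submit/", files={"file": fobj})
    res.raise_for_status()
    return res.json()["token"]


def fetch_features(token):
    """Return the features dict, or None while processing is still running.

    Assumption: the API replies with a non-200 status until the result is ready.
    """
    import requests
    res = requests.get(f"{BASE_URL}/features/{token}")
    return res.json() if res.status_code == 200 else None


def process_one(path, submit=submit_file, fetch=fetch_features,
                poll_interval=2.0, timeout=600.0):
    """Submit one file, then poll with its token until features arrive."""
    token = submit(path)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        features = fetch(token)
        if features is not None:
            return features
        time.sleep(poll_interval)
    raise TimeoutError(f"no result for {path} within {timeout} s")


def process_many(paths, max_workers=8, **kwargs):
    """Run process_one for every path in a bounded thread pool.

    Returns (results, errors), both keyed by path.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(process_one, p, **kwargs): p for p in paths}
        for fut in as_completed(futures):
            path = futures[fut]
            try:
                results[path] = fut.result()
            except Exception as exc:  # keep going if a single file fails
                errors[path] = exc
    return results, errors
```

Calling `results, errors = process_many(list_of_paths)` would then give one features dict per file. `max_workers` caps how many uploads and polls are in flight at once, which is probably worth keeping small so that thousands of files do not hit the server simultaneously.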

For example, I tried this code (https://dev.to/ndrbrt/python-upload-multiple-files-concurrently-with-aiohttp-and-show-progress-bars-with-tqdm-32l7) in a Jupyter notebook:

    import asyncio
    import os

    import aiofiles
    import aiohttp
    from tqdm import tqdm


    class FileManager():
        def __init__(self, file_name: str):
            self.name = file_name
            self.size = os.path.getsize(self.name)
            self.pbar = None

        def __init_pbar(self):
            self.pbar = tqdm(
                total=self.size,
                desc=self.name,
                unit='B',
                unit_scale=True,
                unit_divisor=1024,
                leave=True)

        async def file_reader(self):
            self.__init_pbar()
            chunk_size = 64*1024
            async with aiofiles.open(self.name, 'rb') as f:
                chunk = await f.read(chunk_size)
                while chunk:
                    self.pbar.update(chunk_size)
                    yield chunk
                    chunk = await f.read(chunk_size)
                self.pbar.close()


    async def upload(file: FileManager, url: str, session: aiohttp.ClientSession):
        try:
            data = {'file': open(file.name, 'rb')}
            async with session.post(url, data={'data': data}) as res:
                # NB: if you also need the response content, you have to await it
                return res
        except Exception as e:
            # handle error(s) according to your needs
            print(e)


    async def main(files):
        url = 'https://something/submit/'
        files = [FileManager(file) for file in files]
        async with aiohttp.ClientSession() as session:
            res = await asyncio.gather(*[upload(file, url, session) for file in files])
        print(f'All files have been uploaded ({len(res)})')
        return res

I started it with:

    path_1 = '/media/SPEAKER_01/interview-10001.wav'
    path_2 = '/media/SPEAKER_00/interview-10001.wav'
    files = [path_1, path_2]
    res = await main(files)
    res

But I get back:

    [422 Unprocessable Entity]>
    <CIMultiDictProxy('Content-Length': '89', 'Content-Type': 'application/json', 'Date': 'Fri, 19 Jul 2024 09:54:29 GMT', 'Server': 'envoy', 'x-envoy-upstream-service-time': '23')>, ...
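A plausible cause of the 422: FastAPI returns that status when request validation fails, and the aiohttp call sends `data={'data': data}` rather than a multipart field named `file`, which is the field the working `requests` call uses. The sketch below builds the body with `aiohttp.FormData` and caps concurrency with a semaphore. The helper names `upload` and `gather_limited` and the `limit` parameter are hypothetical, and that the endpoint expects exactly one field called `file` is an assumption inferred from the `requests` example, not something confirmed here.

```python
import asyncio
import os


async def upload(session, url, path):
    """POST one file as multipart field 'file' and return the parsed JSON reply."""
    import aiohttp  # third-party; imported lazily so gather_limited needs only the stdlib
    form = aiohttp.FormData()
    with open(path, "rb") as fobj:
        # Assumption: the server expects the same field name as in the requests call.
        form.add_field("file", fobj, filename=os.path.basename(path))
        async with session.post(url, data=form) as res:
            res.raise_for_status()
            # Read the body while the response is still open.
            return await res.json()


async def gather_limited(items, worker, limit=10):
    """Run worker(item) for every item, with at most `limit` running at once.

    Results come back in input order; a failed item yields its exception
    object instead of aborting the whole batch.
    """
    sem = asyncio.Semaphore(limit)

    async def guarded(item):
        async with sem:
            return await worker(item)

    return await asyncio.gather(*(guarded(i) for i in items),
                                return_exceptions=True)


async def main(paths, url="https://something/submit/", limit=10):
    """Upload all paths through one shared session and collect the replies."""
    import aiohttp
    async with aiohttp.ClientSession() as session:
        return await gather_limited(paths, lambda p: upload(session, url, p), limit)
```

In a Jupyter notebook this would be started with `res = await main(files)`, as in the question. If a 422 still comes back, the JSON body of that response (89 bytes in the output above) should name the field that failed validation, so reading it with `await res.text()` before the response is closed is a reasonable next step.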
