Commit 3127fd1
1 Parent(s): 23c74d4
Update README.md
README.md CHANGED
@@ -12,10 +12,13 @@ pinned: false
# EE 298 DL Assignment 1 (2S2021-22) by Paul Darvin
Demo Application for Sound Event Detection in a Hugging Face Space
## Link to Original/Reference Code
- The code in this repository is derived solely from the [PANNs inference](https://github.com/qiuqiangkong/panns_inference) GitHub repository, which is an extension of the parent repository for the paper [PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition](https://
## Background
A sound event detection system is an audio tagging system applied to time segments of an audio signal. It identifies tags such as the presence of an object, a living thing, or an action that generates sound at a particular time.
## Significance
- Applications of sound event detection systems are wide-ranging. For instance, a deaf person can use such a system to detect an approaching vehicle or to watch a movie with its sounds described to them. It can aid forensics by identifying the presence of objects and actions in audio evidence. It can also be used to navigate a large audio file using time-indexed tags.
## Usage
Upload an audio file in WAV format. Other formats are not yet supported.
# EE 298 DL Assignment 1 (2S2021-22) by Paul Darvin
Demo Application for Sound Event Detection in a Hugging Face Space
## Link to Original/Reference Code
+ The code in this repository is derived solely from the [PANNs inference](https://github.com/qiuqiangkong/panns_inference) GitHub repository, which is an extension of the [parent repository](https://github.com/qiuqiangkong/audioset_tagging_cnn) for the paper [PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition](https://arxiv.org/pdf/1912.10211v5.pdf).
## Background
A sound event detection system is an audio tagging system applied to time segments of an audio signal. It identifies tags such as the presence of an object, a living thing, or an action that generates sound at a particular time.
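To make "audio tagging applied to time segments" concrete, here is a minimal sketch of one common post-processing step: turning per-frame scores for a single tag into time-stamped events. The function name `frames_to_events`, the 0.5 threshold, and the 0.01 s frame duration are illustrative assumptions, not values taken from this repository.

```python
def frames_to_events(frame_scores, threshold=0.5, frame_seconds=0.01):
    """Group consecutive frames scoring at or above `threshold` into events.

    Returns a list of (start_seconds, end_seconds) tuples for one tag.
    The threshold and frame duration are assumed values for illustration.
    """
    events = []
    start = None  # index of the first frame of the event currently open
    for i, score in enumerate(frame_scores):
        active = score >= threshold
        if active and start is None:
            start = i  # an event begins at this frame
        elif not active and start is not None:
            events.append((start * frame_seconds, i * frame_seconds))
            start = None  # the event ended just before this frame
    if start is not None:
        # The last event runs to the end of the clip.
        events.append((start * frame_seconds, len(frame_scores) * frame_seconds))
    return events
```

Running this once per tag yields the kind of time-indexed tags described above; a real system might also smooth the scores or merge events separated by very short gaps.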
## Significance
+ Applications of sound event detection systems are wide-ranging. For instance, a deaf person can use such a system to detect an approaching vehicle or to watch a movie with its sounds described to them. It can aid forensics by identifying the presence of objects and actions in audio evidence. It can also be used to navigate a large audio file using time-indexed tags. Robots can be made more "human" by giving them the ability to interpret audio signals the way humans do.
+ ## Model Description
+ CNN14 is a 14-layer convolutional neural network built from 6 convolutional blocks (two convolutional layers each) followed by two fully connected layers. Its input is a log-mel spectrogram with 1000 frames and 64 mel bins, which represents the audio clip as an image-like time-frequency array. The details of the architecture can be found in the [paper](https://arxiv.org/pdf/1912.10211v5.pdf).
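As a rough check on where the 1000 frames come from, the sketch below computes the spectrogram shape for a clip. It assumes a 32 kHz sample rate, a hop of 320 samples between frames, and a 10-second clip; these are assumptions for illustration and are not stated in this README. The helper name `input_shape` is hypothetical.

```python
def input_shape(duration_seconds, sample_rate=32000, hop_size=320, mel_bins=64):
    """Approximate (frames, mel_bins) shape of a log-mel spectrogram.

    Assumes one frame per hop; sample_rate and hop_size are assumed
    values, so check them against the configuration actually in use.
    """
    num_samples = int(duration_seconds * sample_rate)
    frames = num_samples // hop_size
    return frames, mel_bins


# A 10-second clip under these assumptions gives 1000 frames of 64 mel bins.
print(input_shape(10.0))
```

Depending on how the short-time Fourier transform pads the signal, an implementation may produce one extra frame, so treat the count as approximate.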
+ The authors report a mean average precision (mAP) of 0.431 for CNN14 on AudioSet tagging, exceeding the previous best system's mAP of 0.392.
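For readers unfamiliar with the metric, the following is a minimal sketch of how mean average precision can be computed: average precision is calculated per class from ranked scores, then averaged across classes. This is the plain non-interpolated definition written for illustration; the authors' evaluation code may differ in details such as tie handling.

```python
def average_precision(scores, targets):
    """Non-interpolated average precision for one class.

    scores: predicted score per clip; targets: 1 if the class is present, else 0.
    """
    # Rank clips from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if targets[i]:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(scores_per_class, targets_per_class):
    """Mean of the per-class average precisions."""
    aps = [
        average_precision(s, t)
        for s, t in zip(scores_per_class, targets_per_class)
    ]
    return sum(aps) / len(aps)
```

A class whose positives are all ranked above its negatives scores 1.0, so an mAP of 0.431 over several hundred classes indicates many classes remain hard to rank well.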
## Usage
Upload an audio file in WAV format. Other formats are not yet supported.
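Before uploading, a file can be checked locally to confirm it is a readable WAV. The sketch below uses only Python's standard `wave` module; the helper names `describe_wav` and `is_supported_upload` are hypothetical, and the demo application itself may read files differently.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        duration = wav_file.getnframes() / sample_rate
    return sample_rate, channels, duration


def is_supported_upload(path):
    """Return True if the standard library can parse `path` as a WAV file."""
    try:
        describe_wav(path)
    except (wave.Error, EOFError):
        # wave.Error: not a RIFF/WAVE file or an unsupported encoding.
        # EOFError: the file is too short to contain a WAV header.
        return False
    return True
```

Note that the `wave` module only reads uncompressed PCM data, so some valid WAV variants would be rejected by this check.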