Sounds like the program is doing one of the following:
-bad decoding
-bad mixing
-bad encoding
Decoding and encoding are unlikely culprits as long as the implementation complies with the audio compression formats involved.
What I suspect is happening is that as you mix additional audio on top of the original, you introduce clipping in the signal*.
The program probably mixes the audio sources by simply adding their waveforms together sample by sample, and doesn't rescale the resulting waveform to avoid clipping. What you should do is export the sound from the video project, remix it in a program that gives you full control over the levels (Audacity would be perfectly sufficient), export the mixed audio from Audacity, then use that file as the audio track source for the video.
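To show what "adding without rescaling" does, here is a rough sketch in Python. It assumes 16-bit samples and uses two made-up sine tones (`voice` and `music` are just illustrative names); I don't know what your editor actually does internally, so treat this as an illustration of the idea rather than its real mixing code.

```python
import math

FULL_SCALE = 32767  # largest positive value a signed 16-bit sample can hold

def tone(freq_hz, amplitude, n=1000, rate=8000):
    """Generate n samples of a sine tone as integers (hypothetical test signal)."""
    return [round(amplitude * math.sin(2 * math.pi * freq_hz * i / rate))
            for i in range(n)]

def mix_clipped(a, b):
    """Add sample by sample; sums outside the 16-bit range are clamped (clipped)."""
    return [max(-FULL_SCALE - 1, min(FULL_SCALE, x + y)) for x, y in zip(a, b)]

def mix_rescaled(a, b):
    """Add sample by sample, then scale the whole mix down if its peak is too large."""
    summed = [x + y for x, y in zip(a, b)]
    peak = max(abs(s) for s in summed)
    if peak <= FULL_SCALE:
        return summed
    gain = FULL_SCALE / peak
    # int() truncates toward zero, so no result exceeds FULL_SCALE in magnitude
    return [int(s * gain) for s in summed]

voice = tone(440, 25000)
music = tone(440, 25000)  # same frequency and phase, so the sum peaks near 50000

clipped = mix_clipped(voice, music)
rescaled = mix_rescaled(voice, music)

# Count samples stuck at the limits of the 16-bit range
clipped_count = sum(1 for s in clipped if s >= FULL_SCALE or s <= -FULL_SCALE - 1)
print("samples flattened by naive mixing:", clipped_count)
print("peak after rescaling:", max(abs(s) for s in rescaled))
```

Rescaling the whole mix by one gain factor keeps the waveform shape intact but makes everything quieter, which is essentially what you do by hand when you pull the track levels down in Audacity.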
If the video editing program is worth anything at all it'll have a feature that allows you to specify sound file source. If it doesn't allow that, switch to VirtualDub.
*Short introduction to clipping: it is a consequence of storing data digitally with a fixed range of values, and it can happen in image or audio data (among others). Basically, if two sample values add up to more than the largest value the sample's bit depth can represent, the result gets capped at full scale (0 dBFS). That wouldn't be a problem if it were just one sample, but usually a number of adjacent samples all get flattened to full scale, and that causes noticeably bad sound quality (an infamous example is the LoudTube videos, which are intentionally amplified to introduce severe clipping in the audio). In images, blown-out whites in overexposed photos are essentially the same thing: an irreversible loss of information.
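If you want to check an exported track for this, one simple approach is to look for runs of adjacent samples sitting at full scale. The sketch below assumes you already have the samples as a list of 16-bit integers; `longest_clipped_run` is a name I made up for illustration, not a function from any audio library.

```python
FULL_SCALE = 32767  # largest positive value a signed 16-bit sample can hold

def longest_clipped_run(samples, full_scale=FULL_SCALE):
    """Return the length of the longest run of consecutive samples at or beyond full scale."""
    longest = current = 0
    for s in samples:
        if abs(s) >= full_scale:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

clean = [0, 12000, 30000, 12000, 0, -12000, -30000]
flattened = [0, 20000, 32767, 32767, 32767, 20000, 0, -32768, -32768]

print(longest_clipped_run(clean))      # 0
print(longest_clipped_run(flattened))  # 3
```

A single full-scale sample is usually harmless; long runs are the ones you actually hear as distortion.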