Message ID | 201301172026.45514.arnd@arndb.de |
---|---|
State | New |
Headers | show |
On 17.01.2013 21:26, Arnd Bergmann wrote: > On Thursday 17 January 2013, Soeren Moch wrote: >> On 17.01.2013 11:49, Arnd Bergmann wrote: >>> On Wednesday 16 January 2013, Soeren Moch wrote: >>>>>> I will see what I can do here. Is there an easy way to track the buffer >>>>>> usage without having to wait for complete exhaustion? >>>>> >>>>> DMA_API_DEBUG >>>> >>>> OK, maybe I can try this. >>>>> >>> >>> Any success with this? It should at least tell you if there is a >>> memory leak in one of the drivers. >> >> Not yet, sorry. I have to do all the tests in my limited spare time. >> Can you tell me what to search for in the debug output? > > Actually now that I've looked closer, you can't immediately see > all the mappings as I thought. > > But please try enabling DMA_API_DEBUG in combination with this > one-line patch: > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > index 6b2fb87..3df74ac 100644 > --- a/arch/arm/mm/dma-mapping.c > +++ b/arch/arm/mm/dma-mapping.c > @@ -497,6 +497,7 @@ static void *__alloc_from_pool(size_t size, struct page **ret_page) > pr_err_once("ERROR: %u KiB atomic DMA coherent pool is too small!\n" > "Please increase it with coherent_pool= kernel parameter!\n", > (unsigned)pool->size / 1024); > + debug_dma_dump_mappings(NULL); > } > spin_unlock_irqrestore(&pool->lock, flags); > > That will show every single allocation that is currently active. This lets > you see where all the memory went, and if there is a possible leak or > excessive fragmentation. > > Arnd > Please find attached a debug log generated with your patch. I used the sata disk and two em28xx dvb sticks, no other usb devices, no ethernet cable connected, tuners on saa716x-based card not used. What I can see in the log: a lot of coherent mappings from sata_mv and orion_ehci, a few from mv643xx_eth, no other coherent mappings. All coherent mappings are page aligned, some of them (from orion_ehci) are not really small (as claimed in __alloc_from_pool). I don't believe in a memory leak. When I restart vdr (the application utilizing the dvb sticks) then there is enough dma memory available again. Regards, Soeren
On Thu, Jan 17, 2013 at 08:26:45PM +0000, Arnd Bergmann wrote: > On Thursday 17 January 2013, Soeren Moch wrote: > > On 17.01.2013 11:49, Arnd Bergmann wrote: > > > On Wednesday 16 January 2013, Soeren Moch wrote: > > >>>> I will see what I can do here. Is there an easy way to track the buffer > > >>>> usage without having to wait for complete exhaustion? > > >>> > > >>> DMA_API_DEBUG > > >> > > >> OK, maybe I can try this. I tried this. Not what i expected. We have at least one problem with the ethernet driver: WARNING: at lib/dma-debug.c:933 check_unmap+0x4b8/0x8a8() mv643xx_eth_port mv643xx_eth_port.0: DMA-API: device driver failed to check map error[device address=0x000000001f22be00] [size=1536 bytes] [mapped as single] Modules linked in: [<c000db10>] (unwind_backtrace+0x0/0xf4) from [<c0016c44>] (warn_slowpath_common+0x4c/0x64) [<c0016c44>] (warn_slowpath_common+0x4c/0x64) from [<c0016cf0>] (warn_slowpath_fmt+0x30/0x40) [<c0016cf0>] (warn_slowpath_fmt+0x30/0x40) from [<c01ab540>] (check_unmap+0x4b8/0x8a8) [<c01ab540>] (check_unmap+0x4b8/0x8a8) from [<c01abbb8>] (debug_dma_unmap_page+0x8c/0x98) [<c01abbb8>] (debug_dma_unmap_page+0x8c/0x98) from [<c025cb1c>] (mv643xx_eth_poll+0x630/0x800) [<c025cb1c>] (mv643xx_eth_poll+0x630/0x800) from [<c0331d9c>] (net_rx_action+0xcc/0x1d4) [<c0331d9c>] (net_rx_action+0xcc/0x1d4) from [<c001e1b0>] (__do_softirq+0xa8/0x170) [<c001e1b0>] (__do_softirq+0xa8/0x170) from [<c001e3e8>] (do_softirq+0x5c/0x6c) [<c001e3e8>] (do_softirq+0x5c/0x6c) from [<c001e610>] (local_bh_enable+0xcc/0xdc) [<c001e610>] (local_bh_enable+0xcc/0xdc) from [<c0359c74>] (ip_finish_output+0x1c8/0x39c) [<c0359c74>] (ip_finish_output+0x1c8/0x39c) from [<c03571a4>] (ip_local_out+0x28/0x2c) [<c03571a4>] (ip_local_out+0x28/0x2c) from [<c0359564>] (ip_queue_xmit+0x118/0x338) [<c0359564>] (ip_queue_xmit+0x118/0x338) from [<c036e58c>] (tcp_transmit_skb+0x3fc/0x8e4) [<c036e58c>] (tcp_transmit_skb+0x3fc/0x8e4) from [<c0371218>] (tcp_write_xmit+0x228/0xb08) [<c0371218>] (tcp_write_xmit+0x228/0xb08) from [<c0371b6c>] (__tcp_push_pending_frames+0x30/0x9c) [<c0371b6c>] (__tcp_push_pending_frames+0x30/0x9c) from [<c0362940>] (tcp_sendmsg+0x158/0xdc4) [<c0362940>] (tcp_sendmsg+0x158/0xdc4) from [<c0386620>] (inet_sendmsg+0x38/0x74) [<c0386620>] (inet_sendmsg+0x38/0x74) from [<c031f370>] (sock_aio_write+0x12c/0x138) [<c031f370>] (sock_aio_write+0x12c/0x138) from [<c00a62f4>] (do_sync_write+0xa0/0xd0) [<c00a62f4>] (do_sync_write+0xa0/0xd0) from [<c00a6f18>] (vfs_write+0x13c/0x144) [<c00a6f18>] (vfs_write+0x13c/0x144) from [<c00a6ff0>] (sys_write+0x44/0x70) [<c00a6ff0>] (sys_write+0x44/0x70) from [<c0008ce0>] (ret_fast_syscall+0x0/0x2c) ---[ end trace b75faa8779652e63 ]--- I'm getting about 4 errors reported a second from the ethernet driver. Before i look at issues with em28xx i will first try to get the noise from the ethernet driver sorted out. Andrew
> Please find attached a debug log generated with your patch. > > I used the sata disk and two em28xx dvb sticks, no other usb devices, > no ethernet cable connected, tuners on saa716x-based card not used. > > What I can see in the log: a lot of coherent mappings from sata_mv > and orion_ehci, a few from mv643xx_eth, no other coherent mappings. > All coherent mappings are page aligned, some of them (from orion_ehci) > are not really small (as claimed in __alloc_from_pool). > > I don't believe in a memory leak. When I restart vdr (the application > utilizing the dvb sticks) then there is enough dma memory available > again. Hi Soeren We should be able to rule out a leak. Mount debugfg and then: while [ /bin/true ] ; do cat /debug/dma-api/num_free_entries ; sleep 60 ; done while you are capturing. See if the number goes down. Andrew
On 19.01.2013 19:59, Andrew Lunn wrote: >> Please find attached a debug log generated with your patch. >> >> I used the sata disk and two em28xx dvb sticks, no other usb devices, >> no ethernet cable connected, tuners on saa716x-based card not used. >> >> What I can see in the log: a lot of coherent mappings from sata_mv >> and orion_ehci, a few from mv643xx_eth, no other coherent mappings. >> All coherent mappings are page aligned, some of them (from orion_ehci) >> are not really small (as claimed in __alloc_from_pool). >> >> I don't believe in a memory leak. When I restart vdr (the application >> utilizing the dvb sticks) then there is enough dma memory available >> again. > > Hi Soeren > > We should be able to rule out a leak. Mount debugfg and then: > > while [ /bin/true ] ; do cat /debug/dma-api/num_free_entries ; sleep 60 ; done > > while you are capturing. See if the number goes down. > > Andrew Now I built a kernel with debugfs enabled. It is not clear to me what I can see from the dma-api/num_free_entries output. After reboot (vdr running) I see decreasing numbers (3453 3452 3445 3430...), min_free_entries is lower (3390). Sometimes the output is constant for several minutes ( 3396 3396 3396 3396 3396,...) Soeren
On Wed, Jan 23, 2013 at 04:30:53PM +0100, Soeren Moch wrote: > On 19.01.2013 19:59, Andrew Lunn wrote: > >>Please find attached a debug log generated with your patch. > >> > >>I used the sata disk and two em28xx dvb sticks, no other usb devices, > >>no ethernet cable connected, tuners on saa716x-based card not used. > >> > >>What I can see in the log: a lot of coherent mappings from sata_mv > >>and orion_ehci, a few from mv643xx_eth, no other coherent mappings. > >>All coherent mappings are page aligned, some of them (from orion_ehci) > >>are not really small (as claimed in __alloc_from_pool). > >> > >>I don't believe in a memory leak. When I restart vdr (the application > >>utilizing the dvb sticks) then there is enough dma memory available > >>again. > > > >Hi Soeren > > > >We should be able to rule out a leak. Mount debugfg and then: > > > >while [ /bin/true ] ; do cat /debug/dma-api/num_free_entries ; sleep 60 ; done > > > >while you are capturing. See if the number goes down. > > > > Andrew > > Now I built a kernel with debugfs enabled. > It is not clear to me what I can see from the > dma-api/num_free_entries output. After reboot (vdr running) I see > decreasing numbers (3453 3452 3445 3430...), min_free_entries is > lower (3390). Sometimes the output is constant for several minutes ( > 3396 3396 3396 3396 3396,...) We are interesting in the long term behavior. Does it gradually go down? Or is it stable? If it goes down over time, its clearly a leak somewhere. Andrew
On 23.01.2013 17:25, Andrew Lunn wrote: > On Wed, Jan 23, 2013 at 04:30:53PM +0100, Soeren Moch wrote: >> On 19.01.2013 19:59, Andrew Lunn wrote: >>>> Please find attached a debug log generated with your patch. >>>> >>>> I used the sata disk and two em28xx dvb sticks, no other usb devices, >>>> no ethernet cable connected, tuners on saa716x-based card not used. >>>> >>>> What I can see in the log: a lot of coherent mappings from sata_mv >>>> and orion_ehci, a few from mv643xx_eth, no other coherent mappings. >>>> All coherent mappings are page aligned, some of them (from orion_ehci) >>>> are not really small (as claimed in __alloc_from_pool). >>>> >>>> I don't believe in a memory leak. When I restart vdr (the application >>>> utilizing the dvb sticks) then there is enough dma memory available >>>> again. >>> >>> Hi Soeren >>> >>> We should be able to rule out a leak. Mount debugfg and then: >>> >>> while [ /bin/true ] ; do cat /debug/dma-api/num_free_entries ; sleep 60 ; done >>> >>> while you are capturing. See if the number goes down. >>> >>> Andrew >> >> Now I built a kernel with debugfs enabled. >> It is not clear to me what I can see from the >> dma-api/num_free_entries output. After reboot (vdr running) I see >> decreasing numbers (3453 3452 3445 3430...), min_free_entries is >> lower (3390). Sometimes the output is constant for several minutes ( >> 3396 3396 3396 3396 3396,...) > > We are interesting in the long term behavior. Does it gradually go > down? Or is it stable? If it goes down over time, its clearly a leak > somewhere. > Now (in the last hour) stable, occasionally lower numbers: 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3365 3396 3394 3396 3396 3396 3396 3373 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3353 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3394 3396 3396 3396 3396 3396 3396 3396 Before the last pool exhaustion going down: 3395 3395 3389 3379 3379 3374 3367 3360 3352 3343 3343 3343 3342 3336 3332 3324 3318 3314 3310 3307 3305 3299 3290 3283 3279 3272 3266 3265 3247 3247 3247 3242 3236 3236 Soeren
On 23.01.2013 18:07, Soeren Moch wrote: > On 23.01.2013 17:25, Andrew Lunn wrote: >> On Wed, Jan 23, 2013 at 04:30:53PM +0100, Soeren Moch wrote: >>> On 19.01.2013 19:59, Andrew Lunn wrote: >>>>> Please find attached a debug log generated with your patch. >>>>> >>>>> I used the sata disk and two em28xx dvb sticks, no other usb devices, >>>>> no ethernet cable connected, tuners on saa716x-based card not used. >>>>> >>>>> What I can see in the log: a lot of coherent mappings from sata_mv >>>>> and orion_ehci, a few from mv643xx_eth, no other coherent mappings. >>>>> All coherent mappings are page aligned, some of them (from orion_ehci) >>>>> are not really small (as claimed in __alloc_from_pool). >>>>> >>>>> I don't believe in a memory leak. When I restart vdr (the application >>>>> utilizing the dvb sticks) then there is enough dma memory available >>>>> again. >>>> >>>> Hi Soeren >>>> >>>> We should be able to rule out a leak. Mount debugfg and then: >>>> >>>> while [ /bin/true ] ; do cat /debug/dma-api/num_free_entries ; sleep >>>> 60 ; done >>>> >>>> while you are capturing. See if the number goes down. >>>> >>>> Andrew >>> >>> Now I built a kernel with debugfs enabled. >>> It is not clear to me what I can see from the >>> dma-api/num_free_entries output. After reboot (vdr running) I see >>> decreasing numbers (3453 3452 3445 3430...), min_free_entries is >>> lower (3390). Sometimes the output is constant for several minutes ( >>> 3396 3396 3396 3396 3396,...) >> >> We are interesting in the long term behavior. Does it gradually go >> down? Or is it stable? If it goes down over time, its clearly a leak >> somewhere. >> > > Now (in the last hour) stable, occasionally lower numbers: > 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > 3396 3396 3396 3396 3396 3396 3396 3396 3396 3365 3396 3394 3396 3396 > 3396 3396 3373 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > 3396 3353 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > 3394 3396 3396 3396 3396 3396 3396 3396 > > Before the last pool exhaustion going down: > 3395 3395 3389 3379 3379 3374 3367 3360 3352 3343 3343 3343 3342 3336 > 3332 3324 3318 3314 3310 3307 3305 3299 3290 3283 3279 3272 3266 3265 > 3247 3247 3247 3242 3236 3236 > Here I stopped vdr (and so closed all dvb_demux devices), the number was remaining the same 3236, even after restart of vdr (and restart of streaming). > Soeren
> >> > > > >Now (in the last hour) stable, occasionally lower numbers: > >3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > >3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > >3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > >3396 3396 3396 3396 3396 3396 3396 3396 3396 3365 3396 3394 3396 3396 > >3396 3396 3373 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > >3396 3353 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > >3394 3396 3396 3396 3396 3396 3396 3396 > > > >Before the last pool exhaustion going down: > >3395 3395 3389 3379 3379 3374 3367 3360 3352 3343 3343 3343 3342 3336 > >3332 3324 3318 3314 3310 3307 3305 3299 3290 3283 3279 3272 3266 3265 > >3247 3247 3247 3242 3236 3236 > > > Here I stopped vdr (and so closed all dvb_demux devices), the number > was remaining the same 3236, even after restart of vdr (and restart > of streaming). So it does suggest a leak. Probably somewhere on an error path, e.g. its lost video sync. Andrew
On 23.01.2013 19:10, Andrew Lunn wrote: >>>> >>> >>> Now (in the last hour) stable, occasionally lower numbers: >>> 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 >>> 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 >>> 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 >>> 3396 3396 3396 3396 3396 3396 3396 3396 3396 3365 3396 3394 3396 3396 >>> 3396 3396 3373 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 >>> 3396 3353 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 >>> 3394 3396 3396 3396 3396 3396 3396 3396 >>> >>> Before the last pool exhaustion going down: >>> 3395 3395 3389 3379 3379 3374 3367 3360 3352 3343 3343 3343 3342 3336 >>> 3332 3324 3318 3314 3310 3307 3305 3299 3290 3283 3279 3272 3266 3265 >>> 3247 3247 3247 3242 3236 3236 >>> >> Here I stopped vdr (and so closed all dvb_demux devices), the number >> was remaining the same 3236, even after restart of vdr (and restart >> of streaming). > > So it does suggest a leak. Probably somewhere on an error path, > e.g. its lost video sync. > Now I activated the debug messages in em28xx. From the messages I see no correlation of the pool exhaustion and lost sync. Also I cannot see any error messages from the em28xx driver. I see a lot of init_isoc/stop_urbs (maybe EPG scan?) without draining the coherent pool (checked with 'cat /debug/dma-api/num_free_entries', which gave stable numbers), but after half an hour there are only init_isoc messages without corresponding stop_urbs messages and num_free_entries decreased until coherent pool exhaustion. Any idea where the memory leak is? What is allocating coherent buffers for orion-ehci? Soeren Jan 28 20:46:03 guruvdr kernel: em28xx #0/2-dvb: Using 5 buffers each with 64 x 940 bytes Jan 28 20:46:03 guruvdr kernel: em28xx #0 em28xx_init_isoc :em28xx: called em28xx_init_isoc in mode 2 Jan 28 20:46:03 guruvdr kernel: em28xx #1/2-dvb: Using 5 buffers each with 64 x 940 bytes Jan 28 20:46:03 guruvdr kernel: em28xx #1 em28xx_init_isoc :em28xx: called em28xx_init_isoc in mode 2 Jan 28 20:46:23 guruvdr kernel: em28xx #0 em28xx_stop_urbs :em28xx: called em28xx_stop_urbs Jan 28 20:46:23 guruvdr kernel: em28xx #1 em28xx_stop_urbs :em28xx: called em28xx_stop_urbs Jan 28 20:46:24 guruvdr kernel: em28xx #0/2-dvb: Using 5 buffers each with 64 x 940 bytes Jan 28 20:46:24 guruvdr kernel: em28xx #0 em28xx_init_isoc :em28xx: called em28xx_init_isoc in mode 2 Jan 28 20:46:24 guruvdr kernel: em28xx #1/2-dvb: Using 5 buffers each with 64 x 940 bytes Jan 28 20:46:24 guruvdr kernel: em28xx #1 em28xx_init_isoc :em28xx: called em28xx_init_isoc in mode 2 Jan 28 20:46:44 guruvdr kernel: em28xx #1 em28xx_stop_urbs :em28xx: called em28xx_stop_urbs Jan 28 20:46:44 guruvdr kernel: em28xx #0 em28xx_stop_urbs :em28xx: called em28xx_stop_urbs Jan 28 20:46:45 guruvdr kernel: em28xx #1/2-dvb: Using 5 buffers each with 64 x 940 bytes Jan 28 20:46:45 guruvdr kernel: em28xx #1 em28xx_init_isoc :em28xx: called em28xx_init_isoc in mode 2 Jan 28 20:46:45 guruvdr kernel: em28xx #0/2-dvb: Using 5 buffers each with 64 x 940 bytes Jan 28 20:46:45 guruvdr kernel: em28xx #0 em28xx_init_isoc :em28xx: called em28xx_init_isoc in mode 2 Jan 28 20:54:33 guruvdr kernel: ERROR: 1024 KiB atomic DMA coherent pool is too small! Jan 28 20:54:33 guruvdr kernel: Please increase it with coherent_pool= kernel parameter!
On Mon, Jan 28, 2013 at 09:59:18PM +0100, Soeren Moch wrote: > On 23.01.2013 19:10, Andrew Lunn wrote: > >>>> > >>> > >>>Now (in the last hour) stable, occasionally lower numbers: > >>>3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > >>>3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > >>>3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > >>>3396 3396 3396 3396 3396 3396 3396 3396 3396 3365 3396 3394 3396 3396 > >>>3396 3396 3373 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > >>>3396 3353 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 3396 > >>>3394 3396 3396 3396 3396 3396 3396 3396 > >>> > >>>Before the last pool exhaustion going down: > >>>3395 3395 3389 3379 3379 3374 3367 3360 3352 3343 3343 3343 3342 3336 > >>>3332 3324 3318 3314 3310 3307 3305 3299 3290 3283 3279 3272 3266 3265 > >>>3247 3247 3247 3242 3236 3236 > >>> > >>Here I stopped vdr (and so closed all dvb_demux devices), the number > >>was remaining the same 3236, even after restart of vdr (and restart > >>of streaming). > > > >So it does suggest a leak. Probably somewhere on an error path, > >e.g. its lost video sync. > > > > Now I activated the debug messages in em28xx. From the messages I > see no correlation of the pool exhaustion and lost sync. Also I > cannot see any error messages from the em28xx driver. > I see a lot of init_isoc/stop_urbs (maybe EPG scan?) without > draining the coherent pool (checked with 'cat > /debug/dma-api/num_free_entries', which gave stable numbers), but > after half an hour there are only init_isoc messages without > corresponding stop_urbs messages and num_free_entries decreased > until coherent pool exhaustion. > > Any idea where the memory leak is? What is allocating coherent > buffers for orion-ehci? Keeping in mind that I am completely unfamiliar with usb dvb, my best guess is that the problem is in em28xx-core.c:1131 According to your log messages, it is in mode 2, which is EM28XX_DIGITAL_MODE. There seem to be good hints in 86d38d1e [media] em28xx: pre-allocate DVB isoc transfer buffers I added the relevant parties to the To:... For Gianluca and Mauro, the whole thread may be found at: http://markmail.org/message/wm4wlgzoudixd4so#query:+page:1+mid:o7phz7cosmwpcsrz+state:results thx, Jason. > > Soeren > > > Jan 28 20:46:03 guruvdr kernel: em28xx #0/2-dvb: Using 5 buffers > each with 64 x 940 bytes > Jan 28 20:46:03 guruvdr kernel: em28xx #0 em28xx_init_isoc :em28xx: > called em28xx_init_isoc in mode 2 > Jan 28 20:46:03 guruvdr kernel: em28xx #1/2-dvb: Using 5 buffers > each with 64 x 940 bytes > Jan 28 20:46:03 guruvdr kernel: em28xx #1 em28xx_init_isoc :em28xx: > called em28xx_init_isoc in mode 2 > Jan 28 20:46:23 guruvdr kernel: em28xx #0 em28xx_stop_urbs :em28xx: > called em28xx_stop_urbs > Jan 28 20:46:23 guruvdr kernel: em28xx #1 em28xx_stop_urbs :em28xx: > called em28xx_stop_urbs > Jan 28 20:46:24 guruvdr kernel: em28xx #0/2-dvb: Using 5 buffers > each with 64 x 940 bytes > Jan 28 20:46:24 guruvdr kernel: em28xx #0 em28xx_init_isoc :em28xx: > called em28xx_init_isoc in mode 2 > Jan 28 20:46:24 guruvdr kernel: em28xx #1/2-dvb: Using 5 buffers > each with 64 x 940 bytes > Jan 28 20:46:24 guruvdr kernel: em28xx #1 em28xx_init_isoc :em28xx: > called em28xx_init_isoc in mode 2 > Jan 28 20:46:44 guruvdr kernel: em28xx #1 em28xx_stop_urbs :em28xx: > called em28xx_stop_urbs > Jan 28 20:46:44 guruvdr kernel: em28xx #0 em28xx_stop_urbs :em28xx: > called em28xx_stop_urbs > Jan 28 20:46:45 guruvdr kernel: em28xx #1/2-dvb: Using 5 buffers > each with 64 x 940 bytes > Jan 28 20:46:45 guruvdr kernel: em28xx #1 em28xx_init_isoc :em28xx: > called em28xx_init_isoc in mode 2 > Jan 28 20:46:45 guruvdr kernel: em28xx #0/2-dvb: Using 5 buffers > each with 64 x 940 bytes > Jan 28 20:46:45 guruvdr kernel: em28xx #0 em28xx_init_isoc :em28xx: > called em28xx_init_isoc in mode 2 > Jan 28 20:54:33 guruvdr kernel: ERROR: 1024 KiB atomic DMA coherent > pool is too small! > Jan 28 20:54:33 guruvdr kernel: Please increase it with > coherent_pool= kernel parameter! >
> Now I activated the debug messages in em28xx. From the messages I > see no correlation of the pool exhaustion and lost sync. Also I > cannot see any error messages from the em28xx driver. > I see a lot of init_isoc/stop_urbs (maybe EPG scan?) without > draining the coherent pool (checked with 'cat > /debug/dma-api/num_free_entries', which gave stable numbers), but > after half an hour there are only init_isoc messages without > corresponding stop_urbs messages and num_free_entries decreased > until coherent pool exhaustion. Hi Soeren em28xx_stop_urbs() is only called by em28xx_stop_streaming(). em28xx_stop_streaming() is only called by em28xx_stop_feed() when 0 == dvb->nfeeds. em28xx_stop_feed()and em28xx_start_feed() look O.K, dvb->nfeeds is protected by a mutex etc. Now, em28xx_init_isoc() is also called by buffer_prepare(). This uses em28xx_alloc_isoc() to do the actual allocation, and that function sets up the urb such that on completion the function em28xx_irq_callback() is called. It looks like there might be issues here: Once the data has been copied out, it resubmits the urb: urb->status = usb_submit_urb(urb, GFP_ATOMIC); if (urb->status) { em28xx_isocdbg("urb resubmit failed (error=%i)\n", urb->status); } However, if the ubs_submit_urb fails, it looks like the urb is lost. If you look at other code submitting urbs you have this pattern: rc = usb_submit_urb(isoc_bufs->urb[i], GFP_ATOMIC); if (rc) { em28xx_err("submit of urb %i failed (error=%i)\n", i, rc); em28xx_uninit_isoc(dev, mode); return rc; } Do you have your build such that you would see "urb resubmit failed" in your logs? Are there any? Andrew
On 29.01.2013 12:02, Andrew Lunn wrote: >> Now I activated the debug messages in em28xx. From the messages I >> see no correlation of the pool exhaustion and lost sync. Also I >> cannot see any error messages from the em28xx driver. >> I see a lot of init_isoc/stop_urbs (maybe EPG scan?) without >> draining the coherent pool (checked with 'cat >> /debug/dma-api/num_free_entries', which gave stable numbers), but >> after half an hour there are only init_isoc messages without >> corresponding stop_urbs messages and num_free_entries decreased >> until coherent pool exhaustion. > > Hi Soeren > > em28xx_stop_urbs() is only called by em28xx_stop_streaming(). > > em28xx_stop_streaming() is only called by em28xx_stop_feed() > when 0 == dvb->nfeeds. > > em28xx_stop_feed()and em28xx_start_feed() look O.K, dvb->nfeeds is > protected by a mutex etc. > > Now, em28xx_init_isoc() is also called by buffer_prepare(). This uses > em28xx_alloc_isoc() to do the actual allocation, and that function > sets up the urb such that on completion the function > em28xx_irq_callback() is called. > > It looks like there might be issues here: > > Once the data has been copied out, it resubmits the urb: > > urb->status = usb_submit_urb(urb, GFP_ATOMIC); > if (urb->status) { > em28xx_isocdbg("urb resubmit failed (error=%i)\n", > urb->status); > } > > However, if the ubs_submit_urb fails, it looks like the urb is lost. > > If you look at other code submitting urbs you have this pattern: > > rc = usb_submit_urb(isoc_bufs->urb[i], GFP_ATOMIC); > if (rc) { > em28xx_err("submit of urb %i failed (error=%i)\n", i, > rc); > em28xx_uninit_isoc(dev, mode); > return rc; > } > > Do you have your build such that you would see "urb resubmit failed" > in your logs? Are there any? I only had "urb resubmit failed" messages _after_ the coherent pool exhaustion. So I guess something below the usb_submit_urb call is allocating (too much) memory, sometimes. Or can dvb_demux allocate memory and blame orion-ehci for it? Soeren
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 6b2fb87..3df74ac 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -497,6 +497,7 @@ static void *__alloc_from_pool(size_t size, struct page **ret_page) pr_err_once("ERROR: %u KiB atomic DMA coherent pool is too small!\n" "Please increase it with coherent_pool= kernel parameter!\n", (unsigned)pool->size / 1024); + debug_dma_dump_mappings(NULL); } spin_unlock_irqrestore(&pool->lock, flags);