dri-devel@lists.freedesktop.org 17 July 2012 • 7:28PM -0400

Re: [PATCH] drm/radeon: fix VM page table setup on SI
by Michel Dänzer

On Fri, 2012-06-29 at 14:07 -0400, Jerome Glisse wrote:
> On Fri, Jun 29, 2012 at 12:14 PM, Michel Dänzer <michel@daen...> wrote:
> > On Fri, 2012-06-29 at 11:28 -0400, Jerome Glisse wrote:
> >> On Fri, Jun 29, 2012 at 11:23 AM, Alex Deucher <alexdeucher@gmai...> wrote:
> >> > On Fri, Jun 29, 2012 at 10:49 AM, Michel Dänzer <michel@daen...> wrote:
> >> >> On Thu, 2012-06-28 at 17:53 -0400, alexdeucher@gmai... wrote:
> >> >>> From: Alex Deucher <alexander.deucher@amd....>
> >> >>>
> >> >>> Cayman and trinity allow for variable sized VM page
> >> >>> tables, but SI requires that all page tables be the
> >> >>> same size.  The current code assumes variably sized
> >> >>> VM page tables so SI may end up with part of each page
> >> >>> table overlapping with other memory which could end
> >> >>> up being interpreted by the VM hw as garbage.
> >> >>>
> >> >>> Change the code to better accommodate SI.  Allocate enough
> >> >>> space for at least 2 full page tables and always set
> >> >>> last_pfn to max_pfn on SI so each VM is backed by a full
> >> >>> page table.  This limits us to only 2 VMs active at any
> >> >>> given time on SI.  This will be rectified and the code can
> >> >>> be reunified once we move to two level page tables.
> >> >>>
> >> >>> Signed-off-by: Alex Deucher <alexander.deucher@amd....>
> >> >>
> >> >> This change breaks the radeonsi driver for me. egltri_screen (the
> >> >> 'golden' test for radeonsi at least basically working) locks up the
> >> >> GPU.
> >> >>
> >> >> I don't have any details about the lockup yet, as the GPU reset attempt
> >> >> hangs the machine. Any ideas offhand what radeonsi might be doing wrong?
> >> >
> >> > Maybe trying to access an unmapped page that happened to work by
> >> > accident before and now causes a fault in the VM which halts the MC?
> >
> > Indeed, looks like it:
> >
> > radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x000FF01B
> > radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0202400C
> >
> > Oddly, while I have seen similar errors before (so at
> > least some access to unmapped pages was caught even before your patch),
> > I hadn't noticed them for a while with egltri_screen...
> >
> >
> > Anyway, some more experimentation shows that it doesn't happen if I skip
> > the clear, and it still happens when doing only a clear. I'll look into
> > what might be wrong with the clears next week.
> >
> >
> >> Yeah, that's the only thing I can think of. Can you get a dump of the
> >> various MC fault registers after the lockup?
> >
> > Did you have any particular registers in mind?
>
> I am guessing it's related to default page behavior. Prior to
> this patch you would likely have ended up writing to or reading from
> the dummy page and thus not getting the segfault you deserved. With
> this patch you get the segfault you deserve ;)

Actually, the problem doesn't occur when applying the patch to current
drm-core-next. I'm guessing it was some kind of backend / tiling setup
issue that's been fixed in the meantime. Thanks for the help anyway,
guys.


--
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Debian, X and DRI developer
_______________________________________________
dri-devel mailing list
dri-devel@list...
http://lists.freedesktop.org/mailman/listinfo/dri-devel
