Function thread safety mapping

I was trying to locate if Group::Disband is safe to call anywhere or if it is not thread safe.
My assumption was that it is not thread safe since it is calling ObjectAccessor::FindConnectedPlayer, Player::GetGroup and Player::SetGroup
I tried tracking where Group::Disband is called and everywhere it was in opcode handlers that were marked as threadunsafe.

However I found one or more that were not threadunsafe.
WorldSession::HandleBfEntryInviteResponse is STATUS_LOGGEDIN, PROCESS_INPLACE, which would mean it is executed when received.
And since it seems packets are handled on a separate thread the code must be safe to call anywhere (is thread safe).
The function however calls Battlefield::PlayerAcceptInviteToWar which calls Battlefield::AddOrSetPlayerToCorrectBfGroup which calls Group::RemoveMember which calls Group::Disband
and that was assumed not to be thread safe.

Conclusion is that if someone sends the packet for HandleBfEntryInviteResponse and someone in the same group is also removed from the group in any other code (for example the normal group leave opcode?)
it would cause a race condition.

Is there something guaranteeing the safety or is this a possible race condition?

FFR: I used 3.3.5a https://github.com/TrinityCore/TrinityCore/commit/239f0b4ad0e83da9d31c8031aa2e50c294bfc913

Try triggering that [COLOR= rgb(82, 82, 82)]HandleBfEntryInviteResponse() with 2 players on 2 different maps with 2 MapThreads with worldserver running under helgrind, it should spot the race condition for you

did you trigger the race under helgrind ?

No, I dont have space to have or run a VM. I have less than 2 GB drive space.

Calling group disband during map updates is NOT thread safe, and would in fact race condition. HandleBfEntryInviteResponse needs to be changed to PROCESS_THREADUNSAFE if this is the case.

​I sugest you to use http://windirstat.info to found things you can delete, and windows system cleaner tool to remove old updates uninstallers.

could you post some detailed how to reproduce steps to trigger the race condition ?

It looks like you would need to be in a group (probably that isnt even necessary since you are also added to a group) and accept a battlefield invite.
Group object editing is not safe from multiple threads (member add, remove, disband) and when you join, the code seems to quit your current group and join the battlefield group.

Two players would need to be doing this at the exact same time.
I mean that two players in same group that are the only players in the group must accept invite it would look like. Or similar conditions where either the battlefield group is edited (added players should trigger it) or when the group that is left is edited (two players leave the same group by joining BF)

Helgrind didn’t report anything joining a random bg while in group, tested with 3 players. After the bg I was still in the initial group too, is that expected ?

Try log in with 2 characters
do .bf enable 1 to enable wintergrasp, go to the area with .tele wintergrasp and join the que.
Then use .bf start 1 to start it. Everyone should have got the joining message which when you click you get entered to new group.

Mixed battlefield and battleground in earlier description : /

ps. I might be able to test this myself now that I got myself a bigger C drive that can fit the VM

Ah that’s why I didn’t get any report, I tested Random Battleground, not Wintergrasp.

Wintergrasp is disabled by default and should be enabled only by developers who want to work on it, the config file describes it as “experimental” https://github.com/TrinityCore/TrinityCore/blob/3.3.5/src/server/worldserver/worldserver.conf.dist#L2268 .

Your findings are most likely right and there might be many more issues related to Wintergrasp. I’ll give it another try with Helgrind.

I tried to run it and have a group of 3 players where each is on own map
Then one joins wintergrasp and thus leaves the party (this opcode is handled in place, so by a separate thread from maps and world) and then a player leaves the party (this opcode is handled by world thread).
So 2 threads would be editing the group, though the other was disbanding it since there were only 2 players left while the other edit was to remove a player.

However I cant see anything related to this in the log: https://gist.github.com/Rochet2/7ea280d21add3b5edaeb

was that with MapUpdate.Threads > 1 ?

Map update threads were default (1)

From what I understand, the “in place” handled opcodes are handled by a separate thread that is not the main “world” thread nor the map threads. That is why it wouldnt be needed to have more map threads as the bug happens with the packet handler thread and the world thread.

world thread is on hold when updating maps through map threads, you need to specify at least 2 to trigger helgrind reports (or have any real world issue for the matter)

Hmm, I looked around the code and in place is handled in map tick OR the world tick
and threadsafe is handled in map tick and unthreadsafe is handled in world tick.

If this is true, then you are right that should get two map update threads going.
From the comments in code I imagined earlier that there was a separate thread handling receiving opcodes and executing them unless they needed to be map specific or were unthread safe.

​I tele’d to the area with 1 party member, joined the queue, .bf start 1, accepted: the other party members not in Wintergrasp didn’t get any popup.

Just tested sending CMSG_BATTLEFIELD_MGR_ENTRY_INVITE_RESPONSE with botfarm with 4 bots at Dalaran, Honor Hold, Darnassus and Elwynn Forest with 3 Map Threads and helgrind running and all in the same raid with me too, I saw them leaving the party then worldserver stopped with “Killed” message in the console.
We can mark this issue as “reproduced”.

This is the full race condition log, you can see 2 MapThreads both calling indeed the function you said

Possible data race during read of size 8 at 0x53CEBA90 by thread #8
Locks held: none
at 0x163B512: LinkedListHead::isEmpty() const (LinkedList.h:102)
by 0x163B53B: LinkedListHead::getFirst() (in /home/jackpoz/trinity/bin/worldserver)
by 0x17AA6ED: RefManager<Group, Player>::getFirst() (RefManager.h:33)
by 0x17A9089: GroupRefManager::getFirst() (GroupRefManager.h:31)
by 0x1A85AD6: Group::DelinkMember(ObjectGuid) (Group.cpp:2395)
by 0x1A7D871: Group::RemoveMember(ObjectGuid, RemoveMethod const&, ObjectGuid, char const*) (Group.cpp:538)
by 0x1E6B514: Battlefield::AddOrSetPlayerToCorrectBfGroup(Player*) (Battlefield.cpp:509)
by 0x1E6A978: Battlefield::PlayerAcceptInviteToWar(Player*) (Battlefield.cpp:382)
by 0x1F57F06: WorldSession::HandleBfEntryInviteResponse(WorldPacket&) (BattlefieldHandler.cpp:158)
by 0x1C60984: WorldSession::Update(unsigned int, PacketFilter&) (WorldSession.cpp:321)
by 0x1B865B9: Map::Update(unsigned int) (Map.cpp:656)
by 0x1F9323F: MapUpdateRequest::call() (MapUpdater.cpp:42)

This conflicts with a previous write of size 8 by thread #7
Locks held: none
at 0x173504C: LinkedListElement::delink() (LinkedList.h:57)
by 0x18AD488: Reference<Group, Player>::unlink() (Reference.h:64)
by 0x18923DB: Player::SetGroup(Group*, signed char) (Player.cpp:22467)
by 0x1A7D688: Group::RemoveMember(ObjectGuid, RemoveMethod const&, ObjectGuid, char const*) (Group.cpp:509)
by 0x1E6B514: Battlefield::AddOrSetPlayerToCorrectBfGroup(Player*) (Battlefield.cpp:509)
by 0x1E6A978: Battlefield::PlayerAcceptInviteToWar(Player*) (Battlefield.cpp:382)
by 0x1F57F06: WorldSession::HandleBfEntryInviteResponse(WorldPacket&) (BattlefieldHandler.cpp:158)
by 0x1C60984: WorldSession::Update(unsigned int, PacketFilter&) (WorldSession.cpp:321)
Address 0x53ceba90 is 32 bytes inside a block of size 544 alloc’d
at 0x4C2D577: operator new(unsigned long) (vg_replace_malloc.c:333)
by 0x1AE50BF: WorldSession::HandleGroupInviteOpcode(WorldPacket&) (GroupHandler.cpp:164)
by 0x1C60984: WorldSession::Update(unsigned int, PacketFilter&) (WorldSession.cpp:321)
by 0x1D9C65A: World::UpdateSessions(unsigned int) (World.cpp:2712)
by 0x1D9971C: World::Update(unsigned int) (World.cpp:2075)
by 0x157B002: WorldUpdateLoop() (Main.cpp:380)
by 0x1578C9B: main (Main.cpp:242)
Block was alloc’d by thread #1

The whole BattlefieldHandler.cpp looks unsafe, not sanitizing any input (all I had to do was send CMSG_BATTLEFIELD_MGR_ENTRY_INVITE_RESPONSE with Wintergrasp battlefield ID). For example WorldSession::HandleBfExitRequest() modifies Battlefield data even if the player is on another map.

Battlefield-related packets that are supposed to be sent by the client only when already in the battle should check for mapid, the other ones should be set as PROCESS_THREADUNSAFE

https://github.com/TrinityCore/TrinityCore/commit/f07fe63e50062941dd99ce719f27f6d09ecfbd7d