Multi-agent control in modular motor drives by means of gossip consensus

A modular motor drive (MMD) can be considered as a multi-agent system, in which the agents work together to reach a common goal. One agent in such an MMD comprises only a subset of the concentrated windings and power electronic converter modules and is equipped with an independent controller. The goal of this research is to let the agents cooperate to distribute a torque demand among them in a completely decentralised way, in order to comply with the flexibility and reliability of modular motor drives. However, the current condition of each agent must still be taken into account during this torque allocation. To limit the communication load, only communication between neighbouring agents is allowed. The combination of a gossip consensus algorithm and multiple decentralised PI current controllers is used for this purpose. Simulations and experimental results on a modular axial-flux permanent magnet synchronous machine consisting of five agents confirm that the total torque demand is delivered under the proposed control strategy, even during agent malfunctions, under random communication link failures and with non-identical agents.


| INTRODUCTION
A modular motor drive (MMD) is composed of several identical motor modules, each fed by a dedicated power electronic (PE) converter. A motor module comprises a stator core element and a stator winding. The combination of a motor module and its PE converter module is often referred to as a pole drive unit (PDU) in the literature [1, 2]. Pole drive units can be used as building blocks for electric drives, giving an advantageous economies-of-scale effect, while still meeting the need for flexibility and reliability in present-day applications [3, 4]. Other advantages of this modular approach include additional degrees of freedom for advanced control strategies, power rating scalability, overall drive system size and cost optimisation, easy manufacturing, better thermal performance, and increased efficiency and fault tolerance compared to traditional drives [1-8].
The full potential of MMDs can only be exploited if not only the hardware but also the control is modular. After all, a centralised controller constitutes a single point of failure, and it requires a complete redesign when there is a change or a fault in one of the PDUs. For these reasons, MMDs are often equipped with multiple controllers, each of which regulates only a limited set of PDUs. These controllers run identical control programs and operate as synchronised peers [2]. In the literature, a common choice for this distributed control approach is to divide the n PDUs into p sets of three PDUs (n = 3p), giving rise to so-called multi-three-phase machines [9, 10]. Each set of three PDUs is then equipped with an independent three-phase controller. In this way, the MMD can combine the benefits of the well-consolidated three-phase technology and control strategies with the fault tolerance and flexibility of modularity.
Another major advantage of these modular multi-three-phase machines is that not all sets of three PDUs need to contribute equally to a total torque demand (which is proportional to a certain power demand at constant speed). The (un)equal distribution of the torque or power demand over the sets of PDUs is obtained by means of so-called sharing coefficients [10-12]. These sharing coefficients scale the current set-points and hence the current amplitudes in the different sets. When one of the PDUs in a set is faulted, for instance, its sharing coefficient is set to zero while the current set-points in the healthy sets are increased (with respect for their current limits) to satisfy the torque or power demand. However, all these papers on current sharing have in common that the distribution of the torque or power demand over the different sets (i.e. the determination of the sharing coefficients) happens offline a priori or by means of a central supervisory controller, which conflicts with the modular approach of MMDs.
In this light, it is very interesting to notice that a set of PDUs and its controller in an MMD can be considered as an agent in a multi-agent system. The agents can work together to obtain disturbance rejection [13], distributed optimisation [14, 15], failure diagnosis [16], or, in general, to reach an agreement on a particular quantity of interest [17]. A system of agents that interact locally to reach an agreement on the average value of their initial states, by means of local computations only, is solving the so-called static average consensus problem [18]. A communication network is established between the agents for this purpose. To solve the static average consensus problem, a wide range of algorithms exist, varying in terms of the required timing, nature and topology of the communication network [18].
The goal of this research is hence to use a multi-agent algorithm for the online, completely decentralised distribution of a torque demand among the agents of an MMD, while still taking the present state of each agent into account. The algorithm used for this purpose is inspired by the gossip static average consensus approach for the economic dispatch problem in a smart grid [19, 20]. The agents in the MMD are only allowed to communicate with their two closest neighbours, after which they try to reach a consensus on their so-called incremental costs by means of this partial information and local calculations only. The incremental cost indicates, for each agent individually, how (dis)advantageous it would be to change the portion of the total torque demand that it needs to deliver. In this way, the agents can decide among themselves how the total torque demand is allocated, based on each agent's knowledge of its own condition and without the need for a central supervisory controller. This online, decentralised torque allocation in MMDs that considers the present condition of each agent is the key contribution of this paper. The torque allocation is output in the form of agent-specific current set-points. The currents in the PDUs of each agent are then controlled towards these set-points in a completely autonomous way, by means of the state-of-the-art, decentralised, multi-three-phase PI current control strategy proposed in [9].
An important advantage of the use of a gossip consensus algorithm is that it is in close alignment with the inherent working principles of existing communication protocols. Three different situations will be studied in this research: the gossip consensus algorithm is updated when
- one agent has sent data to one of its two closest neighbours;
- two neighbouring agents have exchanged data;
- one agent has sent data to its two closest neighbours.
These three communication strategies are depicted in Figure 1 for a network consisting of five agents. The goal of this work is not to provide an extensive comparison between these three strategies but to show the feasibility of all three of them. Which communication strategy is used depends on the type of communication protocol that is compatible with the controllers in the MMD. Contrary to a so-called deterministic consensus algorithm, the execution of the gossip consensus algorithm does not need to be postponed until all agents have confirmed that they have received information from their two closest neighbours [18]. This results in less stringent clock synchronisation requirements between the controllers of the agents. Furthermore, it will be shown that the gossip consensus algorithm is robust against random communication link failures. These random link failures, due to noise, for instance, are likely to occur in real-life communication networks [18-20].
Although the methods proposed in this work can be applied for any number of PDUs n and any number of agents p, they will be demonstrated specifically for the case of a 4-kW segmented armature torus (SAT) axial-flux permanent magnet synchronous machine (AFPMSM) with surface-mounted permanent magnets, with n = 15 PDUs combined into p = 5 agents of three PDUs. The MMD architecture is defined in Section 2, with a focus on the example of this case study. The gossip static average consensus algorithm for the determination of the current set-points of the different agents is elaborated in Section 3. The state-of-the-art decentralised PI current control strategy required to bring the currents in the agents towards these set-points is summarised in Section 4. Finally, the proposed multi-agent control strategy (which is the combination of the gossip consensus algorithm and the decentralised PI current control) is experimentally validated in Section 5, after which the influence of the consensus update frequency on the convergence and the settling time is studied in Section 6.

| MODULAR MOTOR DRIVE ARCHITECTURE
The MMD architecture used throughout this work is presented in Figure 2. The MMD contains n motor modules and n PE modules. A motor module comprises a stator core element and a concentrated winding. Each motor module is fed by a dedicated PE module, which consists of a half-bridge inverter leg and a phase current measurement. The PE modules are connected in parallel to a DC power supply. The combination of a motor module and its dedicated PE module is called a PDU. In the case study that is used as an example throughout this work, the MMD consists of n = 15 of these PDUs.
The n PDUs are subdivided into p sets of three PDUs. The three PDUs of a set are connected in one neutral point and are phase shifted by an electrical angle of 2π/3 rad. Each set of three PDUs has its own dedicated, standard, three-phase (micro-)controller. The combination of the three PDUs and their controller is called an agent. An agent can hence be seen as a functional entity that can operate independently of the other agents in the system. The case study used throughout this work comprises p = 5 agents. The PDUs are arranged in such a way that the PDUs of neighbouring agents have a phase shift of 2π/n = 2π/15 electrical radians between them. In concrete terms, this means that the phase angles are ϕ_a = 0 rad, ϕ_b = −2π/15 rad, ϕ_c = −4π/15 rad, and so on.
Finally, a communication network is established between the p agents of the MMD. In this work, each agent can only exchange data with its two closest neighbours, that is, agent x (∈ {1, …, p}) can only communicate with agents x − 1 and x + 1 (∈ {1, …, p}). This communication network is used to distribute a total torque demand among the different agents of the MMD, according to one of the three communication strategies depicted in Figure 1. The communication network is represented as a directed graph G = (V, E), which consists of a set of vertices V = {1, 2, …, p} and a set of directed edges E = {(x, y) | x, y ∈ V}. The vertices represent the agents, and an edge (x, y) means that agent x can send information to agent y. In this work, the communication network hence comprises a set of five agents or vertices (V = {1, 2, 3, 4, 5}) and a set of 10 edges or communication links (E = {(1, 5), (1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3), (4, 5), (5, 4), (5, 1)}).
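As a small illustration, the ring communication network described above can be generated programmatically. The sketch below builds the vertex and edge sets for p = 5 agents; the helper name `neighbours` is ours, not from the paper.

```python
# Ring communication graph of the MMD: each agent can send data to its
# two closest neighbours only.  Agent numbering is 1-based, as in the text.

def neighbours(x: int, p: int = 5) -> tuple[int, int]:
    """Return the two closest neighbours of agent x in a ring of p agents."""
    left = (x - 2) % p + 1   # agent x - 1, with 1 wrapping around to p
    right = x % p + 1        # agent x + 1, with p wrapping around to 1
    return left, right

# Directed edge set E: every agent can send to both of its neighbours,
# giving 2p = 10 communication links for p = 5 agents.
p = 5
E = {(x, y) for x in range(1, p + 1) for y in neighbours(x, p)}
```

This reproduces the edge set E = {(1, 5), (1, 2), (2, 1), …, (5, 1)} listed above.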
Besides the mutual communication between the agents, the total torque demand must also be transferred from the user to the MMD. Only agents 1 and 3 can receive the total current reference I*_q corresponding to this total torque demand directly from the user, as depicted in Figure 2. The relation between the total torque demand and the total current reference I*_q will be explained in Section 3.1. The gossip static average consensus algorithm presented in Section 3 will be used to divide this total current reference I*_q into p agent-specific current set-points i*_q,x (x ∈ {1, …, p}), based on local computations and on the local interactions allowed by the communication network only. The conventional three-phase field-oriented controller of each agent then regulates the currents through its three PDUs towards this individual current set-point, as illustrated in Figure 3 and elaborated upon in Section 4.

| GOSSIP STATIC AVERAGE CONSENSUS ALGORITHM
The goal of the distributed gossip consensus algorithm is to distribute a torque demand among the different agents of an MMD. Only local computations and neighbour-to-neighbour interactions are allowed for this purpose, in order to adhere to the flexibility and redundancy of MMDs. Three major advantages of the gossip consensus algorithm presented in this section are
- that each agent can take knowledge of its own present condition into account during this decentralised torque allocation process,
- that the algorithm is robust against random communication link failures,
- and that perfect clock synchronisation between the controllers of the agents is not required.

| Definitions
For a surface-mounted permanent magnet synchronous machine (PMSM) under field-oriented control, the total torque demand T*_em is directly proportional to a total current reference I*_q:

T*_em = (3/2) N_p Ψ_d I*_q,     (1)

where N_p is the number of pole pairs and Ψ_d (Wb) is the flux linkage caused by the permanent magnets. When saturation is neglected and when all the motor modules are identical (i.e. they have the same number of windings, the same stator cores etc.), the distribution of the torque demand can hence be approached as the partitioning of the total current reference I*_q into p current set-points i*_q,x (x ∈ {1, …, p}) for the different agents. These individual current set-points must satisfy

Σ_{x=1}^{p} i*_q,x = I*_q,     (2)

while i*_d,x = 0 A for all agents x, since no flux-weakening is considered in this article. For an interior PMSM, on the other hand, the total torque demand T*_em should be split directly into multiple torque set-points t*_em,x for the individual agents. A maximum-torque-per-ampere control method can then be used to compute the optimal set-points i*_q,x and i*_d,x for every agent [21]. A similar consensus algorithm to the one presented in this work can hence be used for interior PMSMs as well, but with I*_q replaced by T*_em and i*_q,x replaced by t*_em,x. However, this will not be elaborated specifically in this paper.
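As an illustration of this proportionality, the sketch below converts between a torque demand and the total current reference, assuming the standard surface-mounted PMSM torque expression T*_em = (3/2) N_p Ψ_d I*_q. The numeric values of N_p and Ψ_d are placeholders (the real machine constants are listed in Table 2 of the paper), so only the proportionality itself matters here.

```python
# Torque <-> total current reference conversion for a surface-mounted
# PMSM under field-oriented control.  The constants below are assumed
# example values, NOT the machine parameters of the case study.

N_p = 8       # number of pole pairs (assumed)
Psi_d = 0.1   # PM flux linkage in Wb (assumed)

def current_reference(T_em: float) -> float:
    """Total q-axis current reference I*_q for a torque demand T*_em (Nm)."""
    return T_em / (1.5 * N_p * Psi_d)

def torque(I_q: float) -> float:
    """Electromagnetic torque produced by a total q-axis current I_q (A)."""
    return 1.5 * N_p * Psi_d * I_q
```

The two functions are exact inverses of each other, which is all the consensus algorithm relies on: distributing I*_q is equivalent to distributing T*_em.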
In order to take the present condition of each agent into account in the distribution of the torque demand among the agents, each agent has its own cost function C_x. In general, any quadratic function can be used as cost function [19, 20], but, in this work, it is chosen that

C_x(i*_q,x) = R_s,x (i*_q,x)²,     (3)

with R_s,x (Ω) the stator winding resistance of agent x. Differences in R_s,x between the agents are representative of different conditions of the agents. These differences can be due to, for instance, temperature differences, production tolerances, or a fault in a PDU of an agent. The possibility to give every agent another value for R_s,x allows for an unequal allocation of the torque demand among the p agents (i.e. not all i*_q,x are equal), based on the current state of each agent. This feature cannot be obtained with a simple master-slave control approach, in which one agent of the MMD is the master agent that continuously keeps track of the number of agents that can contribute to the torque demand and divides the current reference I*_q into p equal set-points i*_q = I*_q/p, after which this set-point is sent to all the agents by means of neighbour-to-neighbour communication.
In order to obtain a fair share of I*_q among the agents, the so-called incremental cost γ_x, defined as

γ_x = ∂C_x/∂i*_q,x = 2 R_s,x i*_q,x,     (4)

should be equal for all the agents x ∈ {1, …, p}, that is,

γ_1 = γ_2 = … = γ_p.     (5)

When the agents reach agreement on their incremental costs, which is the goal of the consensus algorithm, it would involve just as much effort for every agent to adapt its current set-point.
Finally, the current set-points must adhere to the current limits i_q,min,x and i_q,max,x of every agent x in the machine as well:

i_q,min,x ≤ i*_q,x ≤ i_q,max,x.     (6)

These individual current limits can also be used to represent the present state of the agents. If, for instance, an open-circuit fault is detected in a PDU of agent f (either in a motor module or in a PE module) by means of the technique proposed in Ref. [22], this faulted state can be represented by setting i_q,min,f = i_q,max,f = 0 A. The current limits of the other agents remain unchanged.
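The per-agent quantities of this subsection can be sketched as follows, assuming the quadratic cost C_x = R_s,x (i*_q,x)², which is consistent with the incremental-cost limits γ_min,x = 2 R_s,x i_q,min,x and γ_max,x = 2 R_s,x i_q,max,x used in the algorithm description below. Function names are illustrative only.

```python
# Quadratic cost, incremental cost and current limits of one agent.

def cost(R_sx: float, i_q: float) -> float:
    """Quadratic cost C_x = R_s,x * (i*_q,x)^2 of agent x (in W)."""
    return R_sx * i_q ** 2

def incremental_cost(R_sx: float, i_q: float) -> float:
    """Incremental cost gamma_x = dC_x/di = 2 * R_s,x * i*_q,x (in W/A)."""
    return 2.0 * R_sx * i_q

def clamp(i_q: float, i_min: float, i_max: float) -> float:
    """Enforce the per-agent current limits i_q,min,x <= i*_q,x <= i_q,max,x."""
    return min(max(i_q, i_min), i_max)

# A faulted agent f is represented by i_q,min,f = i_q,max,f = 0 A, which
# forces its set-point to zero regardless of what the consensus proposes:
assert clamp(5.0, 0.0, 0.0) == 0.0
```

Note that the incremental cost is simply the derivative of the cost with respect to the set-point, so a hotter or faultier agent (larger R_s,x) reaches a given incremental cost at a lower current.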

| Communication strategies
The gossip consensus algorithm must ensure that the total torque demand is met (i.e. that Equation 2 is satisfied) and that the distribution is fair (i.e. that Equation 5 is satisfied). For this purpose, each agent makes a local estimate e_x of the mismatch between the total reference I*_q and the total delivered current Σ i*_q,x. The agents exchange their parameters γ_x and e_x via the communication edges that are activated in the communication network at that moment. In one update period T_c of the consensus algorithm, only one or two edges of the communication network are activated, depending on which communication strategy is used:
- communication from agent i to agent j (i → j): only edge (i, j) is activated for the strategy of Figure 1a;
- communication between agents i and j (i ↔ j): edges (i, j) and (j, i) are activated for the strategy of Figure 1b;
- communication from agent m to agents i and j (m → i and j): edges (m, i) and (m, j) are activated for the strategy of Figure 1c,
with i, j, m ∈ {1, …, p} and i ≠ j ≠ m. In this work, T_c is fixed, and also the order in which the different edges are activated is predetermined in a sequence S (unless explicitly stated otherwise):
- communication from i to j: S_1 = {(1, 2), (1, 5), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3), (4, 5), (5, 4), (5, 1)};
- communication between i and j: S_2 activates each pair of opposite edges (i, j) and (j, i) in turn;
- communication from m to i and j: S_3 lets each agent m in turn send along its two outgoing edges (m, i) and (m, j).
In summary, this means that under communication strategy i → j of Figure 1a, only one edge is activated during one period T_c, and that all the edges have been activated once after 2p = 10 periods T_c. Under communication strategies i ↔ j and m → i and j (Figure 1b,c), on the other hand, two edges are activated during one period T_c. Completing the sequences S_2 and S_3 hence only takes p = 5 periods T_c. Which communication strategy is used depends on the type of communication protocol that is established in the MMD.
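One possible realisation of the three activation sequences, consistent with S_1 as listed above, is sketched below. The exact ordering of the pairs in S_2 and S_3 is our assumption; the paper only fixes S_1 explicitly.

```python
# Predetermined activation sequences for a ring of p = 5 agents.
# S1 activates one edge per period T_c; S2 and S3 activate two.

p = 5

def nxt(x: int) -> int:
    """Agent x + 1 in the ring (p wraps around to 1)."""
    return x % p + 1

def prv(x: int) -> int:
    """Agent x - 1 in the ring (1 wraps around to p)."""
    return (x - 2) % p + 1

# i -> j: each agent sends to each of its neighbours in turn (10 edges).
S1 = [e for x in range(1, p + 1) for e in ((x, nxt(x)), (x, prv(x)))]

# i <-> j: each neighbour pair exchanges data (5 pairs of opposite edges).
S2 = [((x, nxt(x)), (nxt(x), x)) for x in range(1, p + 1)]

# m -> i and j: each agent sends to both neighbours at once (5 edge pairs).
S3 = [((x, prv(x)), (x, nxt(x))) for x in range(1, p + 1)]
```

Walking once through S_1 takes 2p = 10 periods T_c, whereas S_2 and S_3 are completed after p = 5 periods, matching the summary above.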
It must be noted, however, that the gossip algorithm also converges when the edges are activated at random, when the period T_c between two activations is not fixed, or when data packages are lost at random [18, 20]. As a result, the gossip consensus algorithm does not require perfect clock synchronisation between the micro-controllers of the different agents, which is one of its major advantages.

| Algorithm description
At each discrete time instant k (with a period T_c between subsequent time instants), the gossip consensus algorithm updates the parameters γ_x(k), i*_q,x(k) and e_x(k) of the agents x, depending on the chosen communication strategy. The general update for an agent x, based on information received from agent y, can be described as follows:

γ_x(k + 1) = P_x[α γ_x(k) + β γ_y(k) + ɛ δ e_x(k)],     (7a)
i*_q,x(k + 1) = γ_x(k + 1)/(2 R_s,x),     (7b)
e_x(k + 1) = ξ e_x(k) + η e_y(k) − (i*_q,x(k + 1) − i*_q,x(k)),     (7c)

with P_x[·] the projection onto the interval [γ_min,x, γ_max,x], and γ_min,x = 2R_s,x i_q,min,x and γ_max,x = 2R_s,x i_q,max,x the incremental costs at the current limits (6). The parameter ɛ is called the learning gain. It is a sufficiently small positive constant, which will be studied in Section 3.4. The parameters α, β, δ, ξ and η depend on the communication strategy and are tabulated in Table 1. These parameters ensure that algorithm (7) converges under every communication strategy [18, 20]. With the parameters of Table 1, only the agents x ∈ {i, j, m} that send or receive data along the activated communication edges at instant k update their values for γ_x, e_x and i*_q,x at that time instant. For the other agents l ∈ V − {i, j, m}, the values γ_l, e_l and i*_q,l remain unchanged at that time instant.
The parameters γ_x, i*_q,x and e_x are initialised as follows:

γ_x(0) = 0 W/A,  i*_q,x(0) = 0 A,  e_x(0) = I*_q/|U| for x ∈ U and e_x(0) = 0 A otherwise,     (9)

where U is the subgroup of the agents that can receive the total current reference I*_q directly from the user, and |U| represents the number of agents in U. The number of agents in U is a trade-off between redundancy on the one hand and additional communication links and communication load on the other hand. For the case study used throughout this work, U = {1, 3}, and hence |U| = 2. When the total current reference I*_q is changed by ΔI*_q at time instant k_0, the estimated mismatches e_x(k_0) of the agents x ∈ U are updated as follows:

e_x(k_0) ← e_x(k_0) + ΔI*_q/|U|.     (10)

(TABLE 1: Overview of the parameters α, β, δ, ξ and η used in the general gossip consensus algorithm (7) for the different communication strategies illustrated in Figure 1: from agent i to agent j along edge (i, j) (Figure 1a), between agents i and j along edges (i, j) and (j, i) (Figure 1b), and from agent m to agents i and j along (m, i) and (m, j) (Figure 1c).)

The basic idea of algorithm (7) is that each agent makes a local estimate of the discrepancy between the total reference current I*_q and the delivered current Σ i*_q,x by means of the parameter e_x, and that it shares this parameter, together with its incremental cost γ_x, with its neighbours. The agents update their own parameters γ_x, i*_q,x and e_x only when they send or receive information. The information about I*_q, which is initially only known by the agents that can communicate directly with the user according to (10), is disseminated among all the agents by means of the term ηe_y(k) in (7c). This term represents the communication between the agents. When, for instance, I*_q is raised, update (10) and the term ηe_y(k) in (7c) cause all estimated mismatches to rise. The term ɛδe_x(k) in (7a) in turn results in an increase in the incremental costs γ_x of all the agents, causing i*_q,x to rise according to (7b). The term −(i*_q,x(k + 1) − i*_q,x(k)) in (7c) prevents i*_q,x from continuing to rise when Σ i*_q,x matches I*_q again. The term βγ_y(k) in (7a), which is the mathematical expression of the exchange of the incremental cost, results in consensus on this incremental cost, as expressed by (5).
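A minimal sketch of one execution of the per-agent update described above is given below, assuming the quadratic-cost relation γ_x = 2 R_s,x i*_q,x with the incremental cost projected onto its limits. The parameters α, β, δ, ξ and η are those of Table 1, and all function and variable names are illustrative, not from the paper.

```python
# One execution of the gossip update (7) for an agent x that has just
# received (gamma_y, e_y) from a neighbour y.

def gossip_update(gamma_x, e_x, i_x, gamma_y, e_y,
                  R_sx, i_min, i_max,
                  alpha, beta, delta, xi, eta, eps):
    g_min, g_max = 2.0 * R_sx * i_min, 2.0 * R_sx * i_max
    # (7a): mix the incremental costs, correct with the local mismatch
    # estimate, and project onto [gamma_min,x, gamma_max,x]
    gamma_new = min(max(alpha * gamma_x + beta * gamma_y + eps * delta * e_x,
                        g_min), g_max)
    # (7b): the set-point follows from the incremental cost
    i_new = gamma_new / (2.0 * R_sx)
    # (7c): update the mismatch estimate; the last term removes the extra
    # current that the new set-point already covers
    e_new = xi * e_x + eta * e_y - (i_new - i_x)
    return gamma_new, e_new, i_new
```

With the symmetric parameters of an i ↔ j exchange (e.g. α = β = ξ = η = 1/2, δ = 1, an assumption on our part), the sum of all mismatch estimates and set-points is conserved, which is what guarantees that Σ i*_q,x eventually matches I*_q.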

| Learning gain design
Before consensus algorithm (7) can be executed, the learning gain ɛ should be tuned properly. This learning gain ɛ strongly influences the convergence speed of algorithm (7). From Refs [18, 20], it can be concluded that, in general, for increasing ɛ > 0, the convergence speed first increases, after which it slows down again. However, a larger ɛ also results in larger changes in i*_q,x and hence in overshoot and ripple in i*_q,x. The selection of the learning gain ɛ is hence a trade-off between the convergence speed and the permissible variability in i*_q,x. To find a proper value for this parameter, the effect of ɛ on the number of iterations that is required for convergence and on the variability in i*_q,x is quantified in Figure 4 for the case study on the AFPMSM with five agents and the machine parameters listed in Table 2. Convergence is reached when criterion (11) is satisfied, and the variability in i*_q,x is defined by (12) as the average magnitude of the changes in i*_q,x, with #steps the number of changes observed in i*_q,x. Both quantities are averaged over changes ΔI*_q of +15, −25 and +10 A and referred to their maximum value in the interval ɛ ∈ [0.007, 0.055] to obtain p.u. values. The variability in i*_q,x is also averaged over all x ∈ {1, …, p}. For all three communication strategies, the learning gain ɛ = 0.013 provides a good trade-off between convergence speed on the one hand, and ripple and overshoot in i*_q,x on the other hand.
It must be noted, however, that the learning gain ɛ needs to be tuned offline, based on knowledge of global system information such as the communication network topology and the number of agents. This central tuning process is not entirely in accordance with the decentralised approach of MMDs. However, Figure 5 shows that, when communication is restricted to data transfer between neighbouring agents only, a fixed value of the learning gain results in adequate performance for different numbers of agents p. This means that the learning gain is quite robust against the addition and the removal of agents. As the number of agents that can be added to or removed from the control strategy during the operation of an MMD is restricted, the parameter ɛ can hence be tuned in such a way that it results in proper performance over the feasible range of numbers of agents p. The decentralised, online tuning of ɛ is out of the scope of this paper.

| Case study
The performance of the consensus algorithm (7) is simulated for the case study of the modular AFPMSM with 15 PDUs combined into five agents, with the machine parameters of Table 2 (unless explicitly stated otherwise) and ɛ = 0.013. Four operating conditions are studied for all three communication strategies depicted in Figure 1, that is, healthy operation, operation under an open-circuit fault in the PDU of an agent, operation under random communication link failures, and operation with different stator winding resistances R_s,x for the different agents.

| Healthy operation
Under healthy operating conditions, the stator winding resistances R_s,x and the current constraints i_q,min,x = 0 A and i_q,max,x = 10 A are the same for every agent x. The variation of the parameters i*_q,x, γ_x and e_x of the five agents under healthy circumstances is depicted in the first 400 iterations shown in Figure 6. One iteration corresponds to one update period T_c, which may or may not be a fixed constant. At iteration 20, the total current reference I*_q (which is proportional to the total torque demand) increases from 0 to 20 A. The set-points i*_q,x of the agents converge to I*_q/p = 4 A under all three communication strategies. The incremental costs γ_x all converge to γ* = I*_q/(Σ_{x=1}^{p} 1/(2R_s,x)) = 0.52 W/A, and the estimated mismatches e_x all converge to 0 A. The communication strategy of Figure 1a (which sends γ_i and e_i along edge (i, j) during one iteration) requires almost twice as many iterations to converge as the other two communication strategies. The reason for this phenomenon is that the strategy of Figure 1a activates only one communication edge during one iteration of the algorithm, whereas the other two strategies activate two communication edges during one iteration. If all communication edges in the network were established by means of point-to-point connections, the strategies of Figure 1b,c would thus exchange twice as many data bits during one period T_c. The update period T_c of the communication strategy of Figure 1a could hence be halved to obtain the same overall bit rate as for these other two communication strategies. The convergence time would then be similar for all three communication strategies. Apart from the differences in the variability in i*_q,x that can be observed in Figures 4 and 6, the three strategies hence lead to comparable performance. The choice of a specific communication protocol (and thus of a specific communication strategy) should hence mainly depend on the compatibility between this communication protocol and the micro-controllers in the MMD.

(FIGURE 5: Effect of the number of agents p on the number of iterations (it.) that is required for convergence and on the variability (var.) in i*_q,x under a fixed value of the learning gain ɛ = 0.013. The p.u. values in this figure are obtained using the same reference values as for Figure 4.)

(FIGURE 6: Current set-points i*_q,x, incremental costs γ_x, and estimated mismatches e_x for a change in I*_q from 0 to 20 A at iteration 20, and an open-circuit fault in agent 1 at iteration 400 (ɛ = 0.013). The variables are shown for the three communication strategies: (a) communication from agent i to agent j; (b) communication between agents i and j; (c) communication from agent m to agents i and j.)
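The steady-state values reported above follow in closed form from the consensus conditions: equal incremental costs, zero mismatch estimates, and a total current matching the reference. The sketch below checks them, with R_s = 65 mΩ per agent inferred from the reported γ* = 0.52 W/A (Table 2 itself is not reproduced here).

```python
# Closed-form consensus equilibrium for the healthy case: all incremental
# costs equal gamma*, and each set-point follows from gamma* = 2*R_s,x*i.

I_q_ref = 20.0          # total current reference in A
R_s = [0.065] * 5       # identical stator resistances in Ohm (assumed 65 mOhm)

gamma_star = I_q_ref / sum(1.0 / (2.0 * R) for R in R_s)
setpoints = [gamma_star / (2.0 * R) for R in R_s]
```

With five identical agents this gives γ* = 0.52 W/A and i*_q,x = 4 A each, matching the simulated convergence values.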

| Operation under open-circuit fault in pole drive unit
Figure 6 also shows the variation of the parameters i*_q,x, γ_x and e_x when an open-circuit fault occurs in a PE module of agent 1 at iteration 400, under a total current reference I*_q of 20 A. The current constraints i_q,min,1 and i_q,max,1 are immediately set equal to zero, resulting in i*_q,1(401) = 0 A and an increase in e_1(401). This causes the incremental costs γ_x of all the agents to increase, until they converge to γ*_1failed = I*_q/(Σ_{x=2}^{p} 1/(2R_s,x)) = 0.65 W/A. The set-points i*_q,x of the four healthy agents hence increase to I*_q/4 = 5 A, in order to meet the total current reference again.
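The post-fault equilibrium can be checked in the same closed form: agent 1 is removed from the sum (its limits force i*_q,1 = 0 A) and the remaining four agents share the full reference. As before, R_s = 65 mΩ per agent is an inferred value.

```python
# Closed-form consensus equilibrium after the open-circuit fault in agent 1.

I_q_ref = 20.0
R_s = {x: 0.065 for x in range(1, 6)}   # assumed 65 mOhm per agent
healthy = [2, 3, 4, 5]                  # agent 1 is faulted (limits = 0 A)

gamma_star = I_q_ref / sum(1.0 / (2.0 * R_s[x]) for x in healthy)
setpoints = {x: gamma_star / (2.0 * R_s[x]) for x in healthy}
```

This reproduces γ*_1failed = 0.65 W/A and the increased set-points of 5 A for the four healthy agents.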

| Operation under random communication link failures
When an edge (x, y) is activated to send information from agent x to agent y, the probability that this information is effectively sent along this edge is p_xy ∈ [0, 1], with p_xy the parameter of a Bernoulli trial [20]. The probability of link failure, due to for instance noise, is hence equal to (1 − p_xy). It is assumed that the Bernoulli processes of any two communication edges of the network are independent and have the same probability of failure. When a communication link fails at time instant k, algorithm (7) is not executed, and the parameters γ_x, i*_q,x and e_x are not updated during that update period T_c of the consensus algorithm. At the next time instant k + 1, the edge(s) that follow next in the communication sequence S are activated. The effect of communication link failures on the number of iterations that is required for convergence is summarised in Figure 7. To generate this figure, the gossip consensus algorithm is run 10,000 times for each communication strategy and for each failure probability (1 − p_xy), and the number of iterations required to satisfy convergence criterion (11) is computed. The error bars show the mean and the standard deviation. It can be concluded that the algorithm still converges under communication link failures, but, on average, the convergence becomes slower with increasing failure probability. The communication strategy of Figure 1a (i → j) is affected the most by communication link failure, due to the fact that a failing edge is only reactivated after 10 update periods T_c of the algorithm. For the other two communication strategies, reactivation of the failed edge only takes five update periods.
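The link-failure model can be sketched as follows: each activation of an edge succeeds with probability p_xy (a Bernoulli trial), and on failure the update for that period T_c is simply skipped while the sequence moves on. Function names are ours.

```python
import random

def activate_edge(p_xy: float, rng: random.Random) -> bool:
    """Bernoulli trial: True when the transfer along the edge succeeds."""
    return rng.random() < p_xy

def run_sequence(S, p_xy: float, rng: random.Random):
    """Walk once through an activation sequence S; return the edges along
    which data was effectively sent (failed edges skip their update)."""
    executed = []
    for edge in S:
        if activate_edge(p_xy, rng):
            executed.append(edge)
    return executed
```

Averaging the iteration count of many such randomised runs, as done 10,000 times per setting for Figure 7, yields the mean and standard deviation shown by the error bars.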

| Operation with non-identical agents
Due to, for instance, temperature differences, production tolerances, or faults, the stator winding resistance R_s,x is not necessarily identical for all the agents in the MMD. Figure 8 shows the variation of the parameters i*_q,x, γ_x and e_x for the case where R_s,1 = 58.5 mΩ, R_s,2 = 71.5 mΩ, R_s,3 = 97.5 mΩ, R_s,4 = 45.5 mΩ and R_s,5 = 65 mΩ. Under the three different communication strategies, the incremental costs γ_x converge to the same value γ* = I*_q/(Σ_{x=1}^{p} 1/(2R_s,x)) = 0.51 W/A for all the agents x, while the estimated mismatches e_x all converge to 0 A. However, the current set-points i*_q,x are not identical for all the agents: the lower the value of R_s,x of agent x, the higher its set-point i*_q,x. The torque is hence allocated unequally among the agents, but the total torque demand is still met, as Σ_x i*_q,x converges to I*_q under all three communication strategies.
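The unequal allocation also follows in closed form from the equal-incremental-cost condition: i*_q,x = γ*/(2 R_s,x), so agents with a lower resistance receive a larger share. The sketch below reproduces the values reported above for I*_q = 20 A.

```python
# Closed-form allocation for the non-identical agents of this subsection.

I_q_ref = 20.0
R_s = [0.0585, 0.0715, 0.0975, 0.0455, 0.065]   # Ohm, agents 1..5

gamma_star = I_q_ref / sum(1.0 / (2.0 * R) for R in R_s)
setpoints = [gamma_star / (2.0 * R) for R in R_s]
```

This gives γ* ≈ 0.51 W/A, with agent 4 (the lowest R_s,x) taking the largest share and agent 3 (the highest R_s,x) the smallest, while the set-points still sum to I*_q.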

| DECENTRALISED PI CURRENT CONTROL
To complete the multi-agent control strategy, the gossip consensus algorithm is combined with the decentralised current control strategy of [9]. Each agent gets a standard, dedicated, independent three-phase PI controller to reach its individual current set-points i*_q,x and i*_d,x = 0 A, generated by the gossip consensus algorithm of Section 3. This strategy is depicted in Figure 3.
The PI parameters of the current controllers are tuned according to the state-of-the-art procedure described in Ref. [9], which takes the magnetic coupling between the n motor modules, and hence between the p agents, into consideration. This magnetic coupling is represented by the mutual inductances listed in Table 2.
(FIGURE 7: Effect of the probability (1 − p_xy) of communication edge failure on the number of iterations before convergence (ɛ = 0.013).)

The gossip consensus algorithm and the decentralised current control algorithm can run at different time scales, that is, the update frequency f_c of the consensus algorithm and the sample frequency f_s of the PI controller do not need to be related. During each sample period T_s, the PI controller of each agent checks only once which specific current set-point i*_q,x has been computed by its gossip consensus controller. When a new value for i*_q,x becomes available after this check, it will be taken into account during the next sample period.
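The latest-value hand-off between the two time scales can be sketched as follows. The class name is illustrative only; on a real micro-controller this would typically be a shared variable or register rather than an object.

```python
# Latest-value mailbox between the consensus task (writer) and the PI
# control task (reader).  The PI controller samples it once per T_s; a
# set-point that arrives after that check is used in the next period.

class SetpointMailbox:
    def __init__(self):
        self._i_q_ref = 0.0

    def post(self, i_q_ref: float) -> None:
        """Called by the consensus task whenever a new i*_q,x is computed."""
        self._i_q_ref = i_q_ref

    def sample(self) -> float:
        """Called once per PI sample period T_s; returns the newest value."""
        return self._i_q_ref
```

Because only the most recent value is kept, the two tasks never block each other, which is why f_c and f_s can be chosen independently.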

| Case study
The performance of the proposed multi-agent control approach (the combination of consensus and decentralised PI control) is simulated for the case study of the modular AFPMSM with 15 PDUs combined into five agents. The performance is studied for both identical and non-identical agents.

| Identical agents
A multi-variable frequency analysis as proposed in Ref. [9] is carried out to find PI parameters that limit the effect of a set-point change Δi*_q,x in agent x to a deviation of the currents i_q,y and i_d,y in the other agents y with a magnitude of at most 13% of the magnitude of Δi*_q,x. All the agents are identical and perfectly synchronised, which means that they have the same stator winding resistances R_s,x, the same sample and update frequencies f_s and f_c, and the same PI parameters. Figure 9 shows the simulation results for communication from agent m to agents i and j (see Figure 1c), under a sample frequency f_s of 10 kHz, a consensus update frequency f_c of 5 kHz, and a learning gain ɛ of 0.013. In one sample period T_s = 1/f_s, one iteration of the PI controller is executed, including the measurement of the phase currents through the PDUs and the generation of the pulses for the PE modules. One iteration of consensus algorithm (7), including the communication from agent m to agents i and j, is executed in one consensus update period T_c = 1/f_c. The effect of varying f_c is discussed in Section 6.
The gossip consensus algorithm indeed makes sure that the total current reference I*_q is distributed adequately among the agents, even when an open-circuit fault occurs in agent 1 after 0.4 s (i.e. Σ_x i*_q,x = I*_q). Also under these faulty circumstances, the reference I*_q can still be varied, as can be seen at t = 0.5 s. The decentralised PI controllers enable the agents to follow their current set-points well, causing the total torque demand to be met (i.e. Σ_x i_q,x = I*_q). The other two communication strategies provide very similar simulation results and are therefore not shown explicitly.

| Non-identical agents

Figure 10 shows the simulation results when the agents of the MMD are not identical. The agent-specific parameters used for this simulation are listed in Table 3. The agents differ not only in stator winding resistances R_s,x but also in the clock rates of their controllers and in their PI parameters.
The differences in R_s,x result in an unequal current sharing among the agents, as was already illustrated in Section 3.5.4.
The different clock rates of the controllers are, for instance, caused by small discrepancies between the oscillators in the micro-controllers of the agents. When not compensated for by a clock synchronisation protocol, the sample frequencies f_s,x and consensus update frequencies f_c,x of the agents x ∈ {1, …, 5} are not identical. This results in a varying time period between the consecutive activations of the communication links and their corresponding updates of the consensus algorithm. The communicating agents can only update their consensus parameters γ_x, i*_q,x and e_x when they have confirmed that all data transfers involved in one iteration of the consensus algorithm were successful. The difference in clock rates also results in a time offset between the sample periods of the different agents. The initial time offset between the sample period of agent 1 and agent x at the start of the simulation is denoted as τ_OS,x in Table 3. The simulation results of Figure 10 illustrate that consensus algorithm (7) is still able to distribute the total current reference I*_q among the agents, even when the controllers are not synchronised and the communicated information hence reaches the agents at different points in their sample periods.
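Algorithm (7) itself is not reproduced in this excerpt, but its underlying mechanism, randomised pairwise gossip whose updates conserve the total, can be sketched as follows. The ring topology, link-success probability, and initial allocation are illustrative assumptions, and the plain averaging below is a simplified stand-in for the incremental-cost updates of the paper:

```python
import random
random.seed(7)

# Randomised pairwise gossip on a ring of five agents (strategy i <-> j).
# Each pairwise averaging step conserves the sum of the local values, so the
# total allocated current stays at I_q_ref even under random link failures
# and irregularly timed (asynchronous) updates.
p = 5
I_q_ref = 20.0
iq = [I_q_ref, 0.0, 0.0, 0.0, 0.0]            # agent 1 initially holds everything
edges = [(x, (x + 1) % p) for x in range(p)]  # nearest-neighbour links only
p_link = 0.8                                  # probability a link activation succeeds

for _ in range(2000):                         # randomly timed gossip rounds
    x, y = random.choice(edges)
    if random.random() < p_link:              # a failed link simply skips the update
        iq[x] = iq[y] = 0.5 * (iq[x] + iq[y])

print(round(sum(iq), 6))                      # 20.0 -> total demand preserved
print(all(abs(v - I_q_ref / p) < 1e-6 for v in iq))  # True -> consensus reached
```

The conservation of Σ_x i_q,x at every step is what makes the scheme tolerant to which links happen to fire and when, mirroring the robustness observed in Figure 10.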
Furthermore, Figure 10 also shows that the decentralised PI controllers of Section 4.1 are able to follow their agent-specific current set-points (generated by the consensus algorithm of Section 3.3) under this asynchronous switching, even when the controllers of the agents have different PI parameters. This situation might occur when each agent is responsible for the tuning of its own PI parameters (independently of the other agents) and results in differences in dynamic behaviour between the agents. Table 3 lists the agent-specific natural frequencies ω_n,x and damping factors ζ_x of the PI controllers used for the simulation in Figure 10. The decentralised self-tuning of the PI controllers is out of the scope of this paper.

FIGURE 9 Simulation results for ɛ = 0.013, N = 400 rpm, V_DC = 48 V, f_s = 10 kHz, f_c = 5 kHz, under the communication strategy m → i and j of Figure 1c. The agents are all identical. An open-circuit failure occurs in agent 1 after 0.4 s

FIGURE 10 Simulation results under differences in stator winding resistances R_s,x, PI parameters, and clock rates between the agents. The agent-specific parameters are listed in Table 3. An open-circuit failure occurs in agent 1 after 0.4 s (ɛ = 0.013, N = 400 rpm, V_DC = 48 V, communication strategy i → j of Figure 1a)
Figure 10 only shows the simulation results for communication strategy i → j, as the other two communication strategies provide very similar results. It can hence be concluded that the proposed multi-agent control approach is robust against the inequalities that occur in MMDs.
For the remainder of this article, it is again assumed that the agents of the MMD are identical, that is, that they have the same R_s,x, the same f_s, the same f_c, and the same PI parameters.

| EXPERIMENTAL VALIDATION
The 4-kW SAT axial-flux PMSM test set-up of Figure 11 is used to validate the simulation results of Section 4.2.1. The specifications of the AFPMSM are listed in Table 2. The AFPMSM is segmented in 15 identical motor modules, of which all 30 power terminals are accessible. The motor modules are divided into five groups of three modules, which are connected in five separate neutral points. The motor modules are fed by a modular low-voltage scalable power platform from Infineon. This power platform consists of five identical sets of three PE modules, which are connected in parallel to a DC power supply with V_DC = 48 V. The combination of three motor modules, three dedicated PE modules, and three LEM LA 25-P current transducers forms an agent. The test set-up hence comprises five agents.
In order to avoid the limitations of a specific communication protocol, the gossip consensus algorithm of Section 3 and the decentralised PI controllers of Section 4 are implemented on the field-programmable gate array of a single dSPACE MicroLabBox, but in a completely separate way. The controllers of the agents are perfectly synchronised. The controllers of neighbouring agents only exchange γ_x and e_x at a fixed, but freely variable communication frequency (and hence a fixed, but freely variable consensus update frequency f_c), and according to the selected communication strategy, as is shown in Figures 1 and 2. Section 6 elaborates on the effect of f_c.
For the experimental validation, the AFPMSM is working in motor mode, with the induction motor drive shown in Figure 11 as load. This load motor maintains a constant mechanical speed N of 400 rpm. The learning gain ɛ is set equal to 0.013. The stator currents are measured with the 15 LEM LA 25-P current transducers in Figure 11, which also provide feedback to the PI controllers at a sampling rate f_s of 10 kHz. The torque is measured with a Lorenz DR-2112 torque transducer with a measurement range of ±50 Nm and an accuracy of 0.1%, mounted on the shaft between the AFPMSM and the load motor. The measurement results are drawn and recorded from the MicroLabBox by means of ControlDesk software at the same sampling rate of 10 kHz.
The experimental results for the three communication strategies are shown in Figures 12-14. The experiments for communication strategy i → j are conducted with f_c = 10 kHz, whereas the experiments for the other two communication strategies are conducted with f_c = 5 kHz. In this way, all the communication edges have been activated once (i.e. all neighbouring agents have exchanged γ_x and e_x once) in a time period of 1 ms for all three communication strategies. The obtained experimental results are very similar for the three communication strategies, and they show a good resemblance with the simulation results in Figure 9. The corresponding experimentally obtained phase current waveforms are shown in Figure 15.

| INFLUENCE OF CONSENSUS UPDATE FREQUENCY ON CONVERGENCE AND SETTLING TIME
The update frequency f_c of the gossip consensus algorithm can be varied independently of the sample frequency f_s of the decentralised PI current controllers. The simulated and experimentally obtained consensus convergence time T_cons (ms) and PI settling time T_PI (ms) for different choices of the update frequency f_c of the consensus algorithm are shown in Figure 17 for the three different communication strategies. Both T_cons and T_PI are averaged over changes ΔI*_q in the total current reference of +15, −25 and +10 A. The parameter T_cons is defined as the time it takes to satisfy requirements (11), whereas T_PI is the time required for the currents to satisfy the corresponding settling requirement. The PI settling time without the consensus algorithm (i.e. where each agent x immediately receives i*_q,x = I*_q/5 A as set-point) serves as a reference in Figure 17.
In simulation, T_cons and T_PI both vary inversely proportionally to f_c, which implies that the maximum rate |d(ΔI*_q)/dt| at which the total reference I*_q can vary is proportional to f_c. For ɛ = 0.013, T_cons and T_PI are nearly twice as large for communication strategy i → j as for the other two communication strategies. This is due to the fact that only one communication edge is activated during one period T_c for strategy i → j, whereas the other two communication strategies activate two communication edges during one period T_c. It should be noted as well that T_cons could be further reduced with a larger ɛ, but this would come at the cost of more ripple and overshoot in i*_q,x, as is shown in Figure 4. The required |d(ΔI*_q)/dt| of the application can hence impose a lower bound on f_c and ɛ, whereas the maximum allowable ripple and overshoot in i*_q,x can impose an upper bound on ɛ. However, the measured PI settling times are consistently higher than the simulated PI settling times. This is due to the speed disturbances that occur when I*_q (and hence the torque) is changed, which are not taken into account in the simulations. For f_c > 5 kHz, f_c has almost no effect on the measured PI settling times, for all three communication strategies. In other words, for f_c > 5 kHz, it is not the consensus algorithm that limits the allowable |d(ΔI*_q)/dt| but the dynamic performance of the complete drive. On the other hand, f_c is bounded by the chosen type of communication protocol as well. A multi-master Controller Area Network, for instance, limits the maximum bit rate to 1 Mb/s [23]. If the estimated mismatches e_x and the incremental costs γ_x are represented by means of 16 bits, 2 × 16 = 32 bits must be transmitted over the network during one iteration T_c of the consensus algorithm for the communication strategy m → i and j.
The update frequency f_c can thus not exceed 10^6/32 = 31.25 kHz. The simulations and measurements of Figures 9 and 14 show that this poses no problem for this specific case study, in which I*_q changes every 100 ms: adequate results are already obtained for a consensus update frequency f_c of 5 kHz, as the consensus algorithm converges in 15 ms.
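The bound above follows from a quick back-of-the-envelope calculation. As in the text, CAN frame overhead (arbitration, CRC, stuff bits) is ignored, so a real implementation would reach a somewhat lower f_c:

```python
# Upper bound on the consensus update frequency imposed by a 1 Mb/s CAN bus,
# for communication strategy m -> i and j (gamma_x and e_x, 16 bits each).
# CAN frame overhead is neglected, as in the text above.
bit_rate = 1_000_000               # maximum CAN bit rate (bit/s)
bits_per_value = 16                # fixed-point width of gamma_x and of e_x
payload_bits = 2 * bits_per_value  # 32 bits per consensus iteration

f_c_max = bit_rate / payload_bits  # consensus iterations per second

print(f_c_max)  # 31250.0 -> f_c cannot exceed 31.25 kHz
```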

| CONCLUSION
A multi-agent control strategy for modular motor drives was introduced in this work. The MMD was split into multiple agents with dedicated controllers, operating as synchronised peers. A gossip static average consensus algorithm was implemented in each agent for the decentralised distribution of a torque demand, which took the condition of each agent into account. Three feasible communication strategies were presented, under which only communication between neighbouring agents was allowed. The independent PI current controllers of the agents then regulated the phase currents of the MMD towards the agent-specific current set-points output by the consensus algorithm. Simulations and experimental results showed that the torque demand was delivered, even under an open-circuit fault in one of the agents, under random communication link failures and under differences between the agents. A study was conducted on the effect of the learning gain and the update frequency of the consensus algorithm on the consensus convergence time and the PI settling time of the decentralised controllers. This study provided useful insights into the allowable rate at which the torque demand can vary and into the feasible communication protocols.

FIGURE 1 The three different communication strategies: the gossip algorithm is updated when (a) one agent has sent data to one of its two closest neighbours (e.g. agent 1 has sent data to agent 2); (b) two neighbouring agents have exchanged data (e.g. agents 1 and 2 have exchanged data); (c) one agent has sent data to its two closest neighbours (e.g. agent 1 has sent data to agents 5 and 2)

FIGURE 4 Effect of the learning gain ɛ on the number of iterations (it.) that is required for convergence and on the variability (var.) in i*_q,x

TABLE 2 fragment (mutual inductances and flux linkage): L_y,y−5 = L_y,y+5 = 6.05; L_y,y−6 = L_y,y+6 = 6.07; L_y,y−7 = L_y,y+7 = 5.31; flux linkage caused by magnets Ψ_d = 0.0221 Wb

FIGURE 8 The current set-points i*_q,x and their sum, the incremental costs γ_x, and the estimated mismatches e_x when the agents have different stator winding resistances R_s,x. The variables are shown for the three communication strategies with ɛ = 0.013: (a) communication from agent i to agent j; (b) communication between agents i and j; (c) communication from agent m to agents i and j

For the sake of clarity of Figure 15, only the first phase current of each agent is shown (i.e. the currents of phases a, b, c, d, and e, which have a phase shift of 2π/15 electrical radians between them). The suggested multi-agent control approach (which combines consensus and decentralised PI control) indeed delivers the demanded torque T*_em = (3/2) N_p Ψ_d I*_q under the three communication strategies, even under an agent malfunction, as can be seen in the torque measurement of Figure 16.

FIGURE 11 Test set-up consisting of a load motor and a modular axial-flux permanent magnet synchronous machine (AFPMSM) with 15 motor modules (1), grouped into five agents. Each agent is equipped with a set of three current sensors (2) and a dedicated three-phase power module (3) with its own DC-link connection (4)

FIGURE 12 Experimental results for ɛ = 0.013, N = 400 rpm, V_DC = 48 V, f_s = 10 kHz, f_c = 10 kHz, under communication strategy i → j of Figure 1a. An open-circuit failure occurs in agent 1 after 0.4 s

FIGURE 13 Experimental results for ɛ = 0.013, N = 400 rpm, V_DC = 48 V, f_s = 10 kHz, f_c = 5 kHz, under the communication strategy i ↔ j of Figure 1b. An open-circuit failure occurs in agent 1 after 0.4 s

FIGURE 14 Experimental results for ɛ = 0.013, N = 400 rpm, V_DC = 48 V, f_s = 10 kHz, f_c = 5 kHz, under the communication strategy m → i and j of Figure 1c. An open-circuit failure occurs in agent 1 after 0.4 s

VERKROOST ET AL.

FIGURE 15 Experimentally obtained phase current waveforms for ɛ = 0.013, N = 400 rpm, V_DC = 48 V, f_s = 10 kHz, f_c = 10 kHz, under the three communication strategies. An open-circuit failure occurs in agent 1 after 0.4 s. (a) Strategy i → j of Figure 1a, (b) strategy i ↔ j of Figure 1b, and (c) strategy m → i and j of Figure 1c

FIGURE 16 Experimentally obtained torque waveforms for ɛ = 0.013, N = 400 rpm, V_DC = 48 V, f_s = 10 kHz, with f_c = 10 kHz for communication strategy i → j and f_c = 5 kHz for strategies i ↔ j and m → i and j. An open-circuit failure occurs in agent 1 after 0.4 s

FIGURE 17 Simulated (S) and measured (M) consensus convergence time T_cons and PI settling time T_PI as a function of the consensus update frequency f_c (ɛ = 0.013, N = 400 rpm, V_DC = 48 V, f_s = 10 kHz). The measured results are averaged over 20 experiments. The simulated and measured PI settling times for the situation without the consensus algorithm are added as reference values

TABLE 3 Agent-specific parameters for the operation with non-identical agents in Section 4.2.2 and Figure 10