Lines Matching +full:a +full:- +full:facing
7 develop drivers for this subsystem as well as a TODO for developers interested
13 The Distributed Switch Architecture is a subsystem which was primarily designed
14 to support Marvell Ethernet switches (MV88E6xxx, a.k.a. Linkstreet product line)
19 they configured/queried a switch port network device or a regular network
22 An Ethernet switch is typically comprised of multiple front-panel ports, and one
24 presence of a management port connected to an Ethernet controller capable of
25 receiving Ethernet frames from the switch. This is a very common setup for all
27 gateways, or even top-of-the-rack switches. This host Ethernet controller will
33 ports are referred to as "dsa" ports in DSA terminology and code. A collection
34 of multiple switches connected to each other is called a "switch tree".
36 For each front-panel port, DSA will create specialized network devices which are
37 used as controlling and data-flowing endpoints for use by the Linux networking
41 The ideal case for using DSA is when an Ethernet switch supports a "switch tag"
42 which is a hardware feature making the switch insert a specific tag for each
46 - what port is this frame coming from
47 - what was the reason why this frame got forwarded
48 - how to send CPU originated traffic to specific ports
52 on Port-based VLAN IDs).
57 - the "cpu" port is the Ethernet switch facing side of the management
58 controller, and as such, would create a duplication of feature, since you
61 - the "dsa" port(s) are just conduits between two or more switches, and as such
63 downstream, or the top-most upstream interface makes sense with that model
66 ------------------------
68 DSA supports many vendor-specific tagging protocols, one software-defined
69 tagging protocol, and a tag-less mode as well (``DSA_TAG_PROTO_NONE``).
74 - identifies which port the Ethernet frame came from/should be sent to
75 - provides a reason why this frame was forwarded to the management interface
82 1. The switch-specific frame header is located before the Ethernet header,
85 2. The switch-specific frame header is located before the EtherType, keeping
88 3. The switch-specific frame header is located at the tail of the packet,
92 A tagging protocol may tag all packets with switch tags of the same length, or
94 require an extended switch tag, or there might be one tag length on TX and a
102 on a best-effort basis, the allocation of packets with enough extra space such
103 that the act of pushing the switch tag on transmission of a packet does not
106 Even though applications are not expected to parse DSA-specific frame headers,
110 ``struct dsa_device_ops`` with a value that uniquely describes the
117 switch tree use the same tagging protocol. In case of a packet transiting a
118 fabric with more than one switch, the switch-specific frame header is inserted
120 typically contains information regarding its type (whether it is a control
121 frame that must be trapped to the CPU, or a data frame to be forwarded).
128 by a leaf switch (not connected directly to the CPU) to not be the same as what
134 EDSA tagging protocol, the operating system sees EDSA-tagged packets from the
142 no DSA links in this fabric, and each switch constitutes a disjoint DSA switch
143 tree. The DSA links are viewed as simply a pair of a DSA master (the out-facing
144 port of the upstream DSA switch) and a CPU port (the in-facing port of the
164 The transmission of a packet goes through the tagger's ``xmit`` function.
165 The passed ``struct sk_buff *skb`` has ``skb->data`` pointing at
169 The job of this method is to prepare the skb in a way that the switch will
171 ports). Typically this is fulfilled by pushing a frame header. Checking for
176 The reception of a packet goes through the tagger's ``rcv`` function. The
177 passed ``struct sk_buff *skb`` has ``skb->data`` pointing at
180 method is to consume the frame header, adjust ``skb->data`` to really point at
181 the first octet after the EtherType, and to change ``skb->dev`` to point to the
182 virtual DSA user network interface corresponding to the physical front-facing
197 with DSA-unaware masters, mangling what the master perceives as MAC DA), the
201 Note that this assumes a DSA-unaware master driver, which is the norm.
204 ----------------------
207 the CPU/management Ethernet interface. Such a driver might occasionally need to
212 devices since they act as a pipe between the host processor and the hardware
216 ----------------------
218 When a master netdev is used with DSA, a small hook is placed in the
220 switch specific tagging protocol. DSA accomplishes this by registering a
221 specific (and fake) Ethernet type (later becoming ``skb->protocol``) with the
222 networking stack, this is also known as a ``ptype`` or ``packet_type``. A typical
229 - receive function is invoked
230 - basic packet processing is done: getting length, status etc.
231 - packet is prepared to be processed by the Ethernet layer by calling
237 if (dev->dsa_ptr != NULL)
238 -> skb->protocol = ETH_P_XDSA
243 -> iterate over registered packet_type
244 -> invoke handler for ETH_P_XDSA, calls dsa_switch_rcv()
248 -> dsa_switch_rcv()
249 -> invoke switch tag specific protocol handler in 'net/dsa/tag_*.c'
253 - inspect and strip switch tag protocol to determine originating port
254 - locate per-port network device
255 - invoke ``eth_type_trans()`` with the DSA slave network device
256 - invoke ``netif_receive_skb()``
262 ---------------------
265 device, each of these network interfaces will be responsible for being a
266 controlling and data-flowing end-point for each front-panel port of the switch.
269 - insert/remove the switch tag protocol (if it exists) when sending traffic
271 - query the switch for ethtool operations: statistics, link state,
272 Wake-on-LAN, register dumps...
273 - external/internal PHY management: link, auto-negotiation etc.
276 pointers which allow DSA to introduce a level of layering between the networking
281 invoke a specific transmit routine which takes care of adding the relevant
290 ------------------------
292 Summarized, this is basically what DSA looks like from a network device
299 +-----------v--|--------------------+
300 |+------+ +------+ +------+ +------+|
302 |+------+-+------+-+------+-+------+|
304 +-----------------------------------+
309 +-----------------------------------+
311 --------+-----------------------------------+------------
313 +-----------------------------------+
318 +-----------------------------------+
320 |+------+ +------+ +------+ +------+|
322 ++------+-+------+-+------+-+------++
325 --------------
327 In order to be able to read from/write to a switch PHY built into it, DSA creates a
328 slave MDIO bus which allows a specific switch driver to divert and intercept
329 MDIO reads/writes towards specific PHY addresses. In most MDIO-connected
332 library and/or to return link status, link partner pages, auto-negotiation
341 ---------------
346 - ``dsa_chip_data``: platform data configuration for a given switch device,
347 this structure describes a switch device's parent device, its address, as
348 well as various properties of its ports: names/labels, and finally a routing
351 - ``dsa_platform_data``: platform device configuration data which can reference
352 a collection of dsa_chip_data structures if multiple switches are cascaded,
356 - ``dsa_switch_tree``: structure assigned to the master network device under
357 ``dsa_ptr``, this structure references a dsa_platform_data structure as well as
360 switch is also provided: CPU port. Finally, a collection of dsa_switch are
363 - ``dsa_switch``: structure describing a switch device in the tree, referencing
364 a ``dsa_switch_tree`` as a backpointer, slave network devices, master network
365 device, and a reference to the backing ``dsa_switch_ops``
367 - ``dsa_switch_ops``: structure referencing function pointers, see below for a
374 -------------------------------
379 - inability to fetch switch CPU port statistics counters using ethtool, which
382 - inability to configure the CPU port link parameters based on the Ethernet
385 - inability to configure specific VLAN IDs / trunking VLANs between switches
386 when using a cascaded setup
389 --------------------------------
391 Once a master network device is configured to use DSA (dev->dsa_ptr becomes
392 non-NULL), and the switch behind it expects a tagging protocol, this network
393 interface can only be used as a conduit interface. Sending packets
394 directly through this interface (e.g.: opening a socket using this interface)
396 the Ethernet switch on the other end, expecting a tag, will typically drop this
404 - MDIO/PHY library: ``drivers/net/phy/phy.c``, ``mdio_bus.c``
405 - Switchdev: ``net/switchdev/*``
406 - Device Tree for various of_* functions
407 - Devlink: ``net/core/devlink.c``
410 ----------------
416 - internal PHY devices, built into the Ethernet switch hardware
417 - external PHY devices, connected via an internal or external MDIO bus
418 - internal PHY devices, connected via an internal MDIO bus
419 - special, non-autonegotiated or non MDIO-managed PHY devices: SFPs, MoCA; a.k.a
425 - if Device Tree is used, the PHY device is looked up using the standard
426 "phy-handle" property, if found, this PHY device is created and registered
429 - if Device Tree is used, and the PHY device is "fixed", that is, conforms to
430 the definition of a non-MDIO managed PHY as defined in
431 ``Documentation/devicetree/bindings/net/fixed-link.txt``, the PHY is registered
434 - finally, if the PHY is built into the switch, as is very common with
440 ---------
444 of per-port slave network devices. As of today, the only SWITCHDEV objects
448 -------
452 links or unused ports) is exposed as a devlink port.
456 - Regions: debugging feature which allows user space to dump driver-defined
457 areas of hardware information in a low-level, binary format. Both global
458 regions as well as per-port regions are supported. It is possible to export
460 to the standard iproute2 user space programs (ip-link, bridge), like address
462 contain additional hardware-specific details which are not visible through
464 the non-user ports too, which are invisible to iproute2 because no network
466 - Params: a feature which enables users to configure certain low-level tunable
468 devlink params, or may add new device-specific devlink params.
469 - Resources: a monitoring feature which enables users to see the degree of
471 - Shared buffers: a QoS feature for adjusting and partitioning memory and frame
473 directions, such that low-priority bulk traffic does not impede the
474 processing of high-priority critical traffic.
479 -----------
481 DSA features a standardized binding which is documented in
484 per-port PHY specific details: interface connection, MDIO bus location etc.
489 DSA switch drivers need to implement a dsa_switch_ops structure which will
499 --------------------
501 - ``tag_protocol``: this is to indicate what kind of tagging protocol is supported,
502 should be a valid value from the ``dsa_tag_protocol`` enum
504 - ``probe``: probe routine which will be invoked by the DSA platform device upon
505 registration to test for the presence/absence of a switch device. For MDIO
506 devices, it is recommended to issue a read towards internal registers using
507 the switch pseudo-PHY and return whether this is a supported device. For other
508 buses, return a non-NULL string
510 - ``setup``: setup function for the switch, this function is responsible for setting
515 a Port-based VLAN ID for each port and allowing only the CPU port and the
519 to issue a software reset of the switch during this setup function in order to
520 avoid relying on what a previous software agent such as a bootloader/firmware
524 -------------------------------
526 - ``get_phy_flags``: Some switches are interfaced to various kinds of Ethernet PHYs,
529 should return a 32-bit bitmask of "flags", that is private between the switch
532 - ``phy_read``: Function invoked by the DSA slave MDIO bus when attempting to read
535 status, auto-negotiation results, link partner pages etc.
537 - ``phy_write``: Function invoked by the DSA slave MDIO bus when attempting to write
538 to the switch port MDIO registers. If unavailable return a negative error
541 - ``adjust_link``: Function invoked by the PHY library when a slave network device
542 is attached to a PHY device. This function is responsible for appropriately
546 - ``fixed_link_update``: Function invoked by the PHY library, and specifically by
548 not be auto-negotiated, or obtained by reading the PHY registers through MDIO.
550 MoCA or other kinds of non-MDIO managed PHYs where out of band link
554 ------------------
556 - ``get_strings``: ethtool function used to query the driver's strings, will
559 - ``get_ethtool_stats``: ethtool function used to query per-port statistics and
564 - ``get_sset_count``: ethtool function used to query the number of statistics items
566 - ``get_wol``: ethtool function used to obtain Wake-on-LAN settings per-port, this
568 Wake-on-LAN settings if this interface needs to participate in Wake-on-LAN
570 - ``set_wol``: ethtool function used to configure Wake-on-LAN settings per-port,
573 - ``set_eee``: ethtool function which is used to configure a switch port EEE (Green
576 controller and data-processing logic
578 - ``get_eee``: ethtool function which is used to query a switch port EEE settings,
580 and data-processing logic as well as query the PHY for its currently configured
583 - ``get_eeprom_len``: ethtool function returning for a given switch the EEPROM
586 - ``get_eeprom``: ethtool function returning for a given switch the EEPROM contents
588 - ``set_eeprom``: ethtool function writing specified data to a given switch EEPROM
590 - ``get_regs_len``: ethtool function returning the register length for a given
593 - ``get_regs``: ethtool function returning the Ethernet switch internal register
594 contents. This function might require user-land code in ethtool to
595 pretty-print register values and registers
598 ----------------
600 - ``suspend``: function invoked by the DSA platform device when the system goes to
602 participating in Wake-on-LAN active as well as additional wake-up logic if
605 - ``resume``: function invoked by the DSA platform device when the system resumes,
606 should resume all Ethernet switch activities and re-configure the switch to be
607 in a fully active state
609 - ``port_enable``: function invoked by the DSA slave network device ndo_open
610 function when a port is administratively brought up, this function should
611 fully enable a given switch port. DSA takes care of marking the port with
612 ``BR_STATE_BLOCKING`` if the port is a bridge member, or ``BR_STATE_FORWARDING`` if it
615 - ``port_disable``: function invoked by the DSA slave network device ndo_close
616 function when a port is administratively brought down, this function should
617 fully disable a given switch port. DSA takes care of marking the port with
619 disabled while being a bridge member
622 ------------
624 - ``port_bridge_join``: bridge layer function invoked when a given switch port is
625 added to a bridge, this function should do what is necessary at the switch
629 - ``port_bridge_leave``: bridge layer function invoked when a given switch port is
630 removed from a bridge, this function should do what is necessary at the
636 - ``port_stp_state_set``: bridge layer function invoked when a given switch port STP
639 computing an STP state change based on current and asked parameters and perform
642 - ``port_bridge_flags``: bridge layer function invoked when a port must
647 flags when the port joins and leaves a bridge. DSA does not currently manage
650 CPU port, and flooding towards the CPU port should also be enabled, due to a
653 - ``port_bridge_tx_fwd_offload``: bridge layer function invoked after
654 ``port_bridge_join`` when a driver sets ``ds->num_fwd_offloading_bridges`` to
655 a non-zero value. Returning success in this function activates the TX
658 the port is a part of. Data plane packets are subject to FDB lookup, hardware
661 handled in hardware and the bridge driver will transmit a single skb for each
662 packet that needs replication. The method is provided as a configuration
666 - ``port_bridge_tx_fwd_unoffload``: bridge layer function invoked when a driver
667 leaves a bridge port which had the TX forwarding offload feature enabled.
670 ---------------------
672 - ``port_vlan_filtering``: bridge layer function invoked when the bridge gets
682 - ``port_vlan_add``: bridge layer function invoked when a VLAN is configured
684 supported by the hardware, this function should return ``-EOPNOTSUPP`` to
685 inform the bridge code to fall back to a software implementation.
687 - ``port_vlan_del``: bridge layer function invoked when a VLAN is removed from the
690 - ``port_vlan_dump``: bridge layer function invoked with a switchdev callback
691 function that the driver has to call for each VLAN the given port is a member
692 of. A switchdev object is used to carry the VID and bridge flags.
694 - ``port_fdb_add``: bridge layer function invoked when the bridge wants to install a
698 function should return ``-EOPNOTSUPP`` to inform the bridge code to fall back to
699 a software implementation.
702 of DSA, would be its port-based VLAN, used by the associated bridge device.
704 - ``port_fdb_del``: bridge layer function invoked when the bridge wants to remove a
709 - ``port_fdb_dump``: bridge layer function invoked with a switchdev callback
711 the given port. A switchdev object is used to carry the VID and FDB info.
713 - ``port_mdb_add``: bridge layer function invoked when the bridge wants to install
714 a multicast database entry. If the operation is not supported, this function
715 should return ``-EOPNOTSUPP`` to inform the bridge code to fall back to a
721 of DSA, would be its port-based VLAN, used by the associated bridge device.
723 - ``port_mdb_del``: bridge layer function invoked when the bridge wants to remove a
728 - ``port_mdb_dump``: bridge layer function invoked with a switchdev callback
730 the given port. A switchdev object is used to carry the VID and MDB info.
733 ----------------
737 DSA is capable of offloading a link aggregation group (LAG) to hardware that
739 as well as between LAGs. A bonding/team interface which holds multiple physical
740 ports constitutes a logical port, although DSA has no explicit concept of a
741 logical port at the moment. Due to this, events where a LAG joins/leaves a
744 state, etc) and objects (VLANs, MDB entries) offloaded to a LAG as bridge port
746 on all members of the LAG. Static bridge FDB entries on a LAG are not yet
747 supported, since the DSA driver API does not have the concept of a logical port
750 - ``port_lag_join``: function invoked when a given switch port is added to a
751 LAG. The driver may return ``-EOPNOTSUPP``, and in this case, DSA will fall
752 back to a software implementation where all traffic from this port is sent to
754 - ``port_lag_leave``: function invoked when a given switch port leaves a LAG
755 and returns to operation as a standalone port.
756 - ``port_lag_change``: function invoked when the link state of any member of
761 can optionally populate ``ds->num_lag_ids`` from the ``dsa_switch_ops::setup``
762 method. The LAG ID associated with a bonding/team interface can then be
763 retrieved by a DSA switch driver using the ``dsa_lag_id`` function.
765 IEC 62439-2 (MRP)
766 -----------------
768 The Media Redundancy Protocol is a topology management protocol optimized for
770 implemented as a function of the bridge driver. MRP uses management PDUs
771 (Test, Topology, LinkDown/Up, Option) sent at a multicast destination MAC
780 however in the case of a device with an offloaded data path such as DSA, it is
781 necessary for the hardware, even if it is not MRP-aware, to be able to extract
783 implementation. DSA today has no driver which is MRP-aware, therefore it only
787 - ``port_mrp_add`` and ``port_mrp_del``: notifies driver when an MRP instance
788 with a certain ring ID, priority, primary port and secondary port is
790 - ``port_mrp_add_ring_role`` and ``port_mrp_del_ring_role``: function invoked
795 IEC 62439-3 (HSR/PRP)
796 ---------------------
798 The Parallel Redundancy Protocol (PRP) is a network redundancy protocol which
801 eliminating the duplicates at the receiver. The High-availability Seamless
803 the redundant traffic are aware of the fact that it is HSR-tagged (because HSR
804 uses a header with an EtherType of 0x892f) and are physically connected in a
809 instantiates a virtual, stackable network interface with two member ports.
812 of RedBox and QuadBox are not implemented (therefore, bridging an hsr network
813 interface with a physical switch port does not produce the expected result).
815 A driver which is capable of offloading certain functions of a DANP or DANH should
817 ``Documentation/networking/netdev-features.rst``. Additionally, the following
820 - ``port_hsr_join``: function invoked when a given switch port is added to a
821 DANP/DANH. The driver may return ``-EOPNOTSUPP`` and in this case, DSA will
822 fall back to a software implementation where all traffic from this port is
824 - ``port_hsr_leave``: function invoked when a given switch port leaves a
825 DANP/DANH and returns to normal operation as a standalone port.
831 -------------------------------------------------------------
834 capable hardware, but does not enforce a strict switch device driver model. On
835 the other hand, DSA enforces a fairly strict device driver model, and deals with most
836 of the switch specifics. At some point we should envision a merger between these
840 --------------------
842 - allowing more than one CPU/management interface: